Huge-page allocator (HPA)¶
Definition¶
A huge-page allocator is the subsystem of a userspace memory allocator responsible for serving allocations backed by huge pages (typically 2 MiB or 1 GiB on x86-64; 2 MiB / 32 MiB / 1 GiB on AArch64) rather than the default 4 KiB base page. In jemalloc the subsystem is named HPA.
Why it matters¶
At hyperscale, a non-trivial fraction of CPU cycles is spent on TLB (Translation Lookaside Buffer) misses — the CPU's cache of virtual→physical address translations. A TLB entry for a 4 KiB page covers only 4 KiB of address space; an entry for a 2 MiB huge page covers 512× more. For long-lived processes with large working sets (typical hyperscale servers), switching to huge pages reduces the TLB miss rate, which reduces page-walk CPU cost, which translates directly into fleet-wide CPU efficiency.
Delivery mechanism: transparent vs explicit¶
- Explicit huge pages — `mmap(MAP_HUGETLB)` against preconfigured `hugetlbfs` mounts. Application-visible, requires root / sysadmin setup, not drop-in.
- Transparent Huge Pages (THP) — Linux kernel feature (since 2.6.38) that transparently merges contiguous 4 KiB pages into 2 MiB huge pages. Application-invisible; `MADV_HUGEPAGE` hints the kernel where to try harder.
jemalloc's HPA primarily targets THP: give the kernel contiguous virtual ranges aligned on huge-page boundaries and hint appropriately, so the kernel can transparently promote allocations to huge pages with zero application config.
Trade-offs¶
- Internal fragmentation — a 2 MiB page serving small allocations wastes memory on unused tail. HPA design must balance TLB win against fragmentation cost via careful size-class + purge policy.
- Memory pressure behaviour — huge pages are harder for the kernel to reclaim; under pressure, THP-promoted pages can be split back into 4 KiB pages, incurring per-page metadata cost.
- NUMA locality — huge pages are harder to migrate across NUMA nodes than 4 KiB pages; a naive HPA can worsen NUMA imbalance.
Seen in¶
- sources/2026-03-02-meta-investing-in-infrastructure-jemalloc-renewed-commitment — Meta's 2026 jemalloc roadmap names HPA improvements as one of four headline focus areas, targeting better THP utilisation for "improved CPU efficiency." No specific perf targets disclosed in this announcement-voice post.
Related¶
- systems/jemalloc — the allocator whose HPA subsystem is the canonical wiki instance.
- concepts/transparent-huge-pages — the Linux kernel feature HPA primarily targets.
- systems/arm64-isa — on AArch64, huge-page size set varies by kernel page-size configuration (4K / 16K / 64K base pages → different huge-page sets), making out-of-box HPA tuning more complex than on x86-64.