PATTERN

Custom data structure for hot path

Intent

When stdlib or well-known crate data structures don't match your workload shape on a genuine hot path, write one that does. Don't force-fit a popular structure into the wrong workload because it's popular.

Context

Off-the-shelf data-structure crates are written for the usual workload shape of that type:

  • HashMap — general-purpose, balanced reads/writes.
  • BTreeMap — sorted iteration, logarithmic reads/writes.
  • radix_trie — type-ahead autocomplete / keyboard-entry workloads.
  • regex — expressive pattern language, not raw-speed matching.

At CDN-scale hot paths, the workload can be very narrow: static key set, short keys, >90 % miss rate, nanosecond budget. None of the above are tuned for exactly that, and the gap shows up as real cores on the fleet bill.

Mechanism

Design a structure that exploits the specifics of the workload:

  • Cache locality. If the whole data fits in L1/L2, fold the entire structure into one contiguous allocation rather than pointer-chasing nodes.
  • Bit tricks. Pack node-relationship metadata into the bits of a machine word (u32/u64) so traversal is shift-and-mask, not pointer dereference.
  • Workload-specific asymptotics. On a miss-heavy workload, beat the hit-optimized general case — even if hits get worse in the process.
  • Static key set at build time. When keys don't change, precompute optimal layout; no dynamic insert cost.

The output is usually a small crate (hundreds to a few thousand lines), open-sourced as a special-purpose tool, not a general library.
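A minimal sketch of the first two mechanisms, under stated assumptions (the names `FlatTrie` and `Node` are invented, not taken from trie-hard or any other crate; keys are restricted to lowercase ASCII for brevity): the whole trie lives in one `Vec`, each node's child edges are bits of a `u32`, and a lookup is shift-and-mask plus a popcount per byte — no per-node allocations, no pointer chasing.

```rust
// Hypothetical sketch: a static membership trie packed into one
// contiguous Vec. Child edges are bits of a u32 per node.

#[derive(Clone, Default)]
struct Node {
    edge_bits: u32,   // bit i set => an edge exists for letter b'a' + i
    first_child: u32, // index of this node's first child in `nodes`
    terminal: bool,   // some key ends exactly at this node
}

struct FlatTrie {
    nodes: Vec<Node>, // the whole structure: one allocation
}

impl FlatTrie {
    /// Build once from a static key set; build cost is off the hot path.
    fn build(keys: &[&str]) -> Self {
        let mut nodes = vec![Node::default()];
        let mut stack: Vec<(usize, Vec<&str>)> = vec![(0, keys.to_vec())];
        while let Some((idx, group)) = stack.pop() {
            let mut buckets: [Vec<&str>; 26] = Default::default();
            let mut edge_bits = 0u32;
            for key in group {
                match key.as_bytes().first() {
                    None => nodes[idx].terminal = true,
                    Some(&b) => {
                        assert!(b.is_ascii_lowercase(), "sketch handles a-z only");
                        let bit = (b - b'a') as usize;
                        edge_bits |= 1 << bit;
                        buckets[bit].push(&key[1..]);
                    }
                }
            }
            // Children are appended contiguously, so rank-by-popcount works.
            nodes[idx].edge_bits = edge_bits;
            nodes[idx].first_child = nodes.len() as u32;
            let mut child = nodes.len();
            for bit in 0..26 {
                if edge_bits & (1 << bit) != 0 {
                    nodes.push(Node::default());
                    stack.push((child, std::mem::take(&mut buckets[bit])));
                    child += 1;
                }
            }
        }
        FlatTrie { nodes }
    }

    /// Hot path: a miss bails on the first absent edge bit.
    fn contains(&self, key: &str) -> bool {
        let mut idx = 0usize;
        for &b in key.as_bytes() {
            if !b.is_ascii_lowercase() {
                return false;
            }
            let bit = (b - b'a') as u32;
            let node = &self.nodes[idx];
            if node.edge_bits & (1 << bit) == 0 {
                return false; // the miss-heavy fast path: one mask test
            }
            // Child index = first_child + count of edge bits below `bit`.
            let below = node.edge_bits & ((1 << bit) - 1);
            idx = (node.first_child + below.count_ones()) as usize;
        }
        self.nodes[idx].terminal
    }
}
```

The bitmask-plus-popcount rank trick is what turns "find the right child" from a pointer chase into two ALU instructions; the single `Vec` is what keeps the whole structure resident in L1/L2.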

Canonical instances

Cloudflare trie-hard (2024)

systems/trie-hard replaced a HashMap<HeaderName> lookup in systems/pingora-origin's clear_internal_headers. Design choices:

  • Node edges as bits of unsigned integers (not child-pointer arrays).
  • Whole trie in one contiguous memory chunk (not per-node allocations).
  • Built for short static keys + >90 % miss rate — the specific shape of HTTP-header membership checks.

Result: 1.53 µs (HashMap) → 0.93 µs (trie-hard), a 1.28 % total-CPU saving, i.e. ~550 cores across a 40,000-core fleet. radix_trie from crates.io benchmarked a full microsecond slower than the HashMap it was meant to replace — the custom crate was the only path to the win.

Figma flat-sorted-Vec (2026)

concepts/small-map-as-sorted-vec — Figma Multiplayer replaced BTreeMap<u16, u64> with Vec<(u16, u64)> for per-node property maps. Design choices:

  • Flat contiguous storage vs pointer-chased tree nodes.
  • Schema-bounded n < 200 and input sorted on the wire — so the theoretically-worse O(n) insert never fires on the load path.

Result: ~25 % memory reduction on large files, and deserialization faster than BTreeMap despite the Big-O regression. Server-fleet memory cost down ~20 %; p99 file-load latency improved ~20 %.
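A minimal sketch of the idea, not Figma's code (`SmallMap` and its method names are invented): with `n` schema-bounded and input arriving pre-sorted, inserts on the load path are O(1) appends, and reads are a binary search over one flat allocation.

```rust
// Hypothetical sketch: a small map stored as a flat sorted Vec instead
// of a pointer-chased BTreeMap. One contiguous allocation, cache-friendly.
struct SmallMap {
    entries: Vec<(u16, u64)>, // kept sorted by key
}

impl SmallMap {
    fn new() -> Self {
        SmallMap { entries: Vec::new() }
    }

    /// Load path: wire input is already sorted, so this is a plain append.
    /// The theoretically-worse O(n) shifting insert never fires here.
    fn push_sorted(&mut self, key: u16, val: u64) {
        debug_assert!(self.entries.last().map_or(true, |&(k, _)| k < key));
        self.entries.push((key, val));
    }

    /// O(log n) lookup over contiguous memory; for n < 200 this is
    /// typically faster in practice than a tree walk.
    fn get(&self, key: u16) -> Option<u64> {
        self.entries
            .binary_search_by_key(&key, |&(k, _)| k)
            .ok()
            .map(|i| self.entries[i].1)
    }
}
```

The `debug_assert!` documents the workload invariant (sorted wire input) that makes the Big-O regression irrelevant; if that invariant ever breaks, it fails loudly in debug builds rather than silently corrupting lookup.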

Anti-patterns

  • Writing a custom data structure before measuring. The crate exists for a reason; if the stock one is within 10 %, ship the stock one. Custom structures are justified when the gap is at least 2× and the volume is high enough to matter.
  • Not benchmarking against real workload. A "faster trie" on random English words may not be faster on 20-byte ASCII HTTP header names with a 90 % miss rate. Match the input distribution.
  • Generalizing the crate too early. trie-hard is narrow on purpose — static key set, miss-heavy, short keys. Trying to make it also handle dynamic inserts / long keys / hit-dominated workloads destroys the reason it won.
  • Skipping the OSS release. Open-sourcing the result is a peer-review step and a future hiring signal; the engineering cost was already paid.
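The "benchmark against real workload" point can be made concrete with a tiny harness. Everything here is invented for illustration — the header names, the 90/10 hit/miss mix, and the function names; substitute your production key set and input distribution, and prefer a proper harness (e.g. Criterion) for the real comparison.

```rust
use std::collections::HashSet;
use std::time::Instant;

// Build a probe stream matching the claimed workload shape:
// short ASCII keys, ~90 % misses against a small static set.
fn make_probes(n: usize) -> Vec<String> {
    (0..n)
        .map(|i| {
            if i % 10 == 0 {
                "x-internal-id".to_string() // hit (1 in 10)
            } else {
                format!("x-miss-{i}") // miss (9 in 10)
            }
        })
        .collect()
}

// Time a membership predicate over the realistic probe stream;
// returns the hit count so callers can sanity-check the mix.
fn bench(label: &str, probes: &[String], lookup: impl Fn(&str) -> bool) -> usize {
    let start = Instant::now();
    let hits = probes.iter().filter(|p| lookup(p.as_str())).count();
    println!("{label}: {hits} hits in {:?}", start.elapsed());
    hits
}

// Baseline to beat; a candidate structure gets the exact same probes.
fn baseline_hits(probes: &[String]) -> usize {
    let keys: HashSet<&str> = HashSet::from(["x-internal-id", "x-internal-debug"]);
    bench("HashSet baseline", probes, |k| keys.contains(k))
}
```

The point of the harness is symmetry: both the stock structure and the candidate see the identical probe stream, so a "faster" result cannot come from benchmarking on the wrong input distribution.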
