File-watcher atomic-swap consolidated map¶
Problem¶
An in-process server module needs to maintain a consolidated lookup structure — e.g., model_name → version → feature_allowlist — assembled from multiple independently-updated source artefacts on disk (one file per model bundle). Requirements:
- Reads must be fast and lock-free enough for the hot path (millions of lookups per second at Pinterest's scale).
- Writes arrive asynchronously (bundle deploys roll out independently) and must be visible without restart.
- Concurrent updates must not corrupt the consolidated view — multiple bundles can refresh in overlapping windows.
- A corrupt or partial update to one source must not poison the consolidated view for all others.
Solution¶
Three-layer structure with file watchers, per-source maps, a consolidated map, and atomic swap under a read-write lock.
Independent per-bundle source maps (one per on-disk artefact)
▲
└── file watcher per artefact → triggers reload of that map only
(other bundles' maps untouched)
Consolidated map (the hot-path read surface)
▲
└── rebuilt from ALL independent maps on any change
└── atomically replaces the current active consolidated map
RW lock
├── shared lock — reads of consolidated map + independent maps
└── unique lock — the atomic swap of the consolidated map
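The structure above can be sketched in a few lines. This is a minimal illustration, not Pinterest's code (which is C++ with a shared/unique read-write lock); all names are hypothetical. The key moves are the same: one map per source, a freshly built merged map, and a single reference swap so readers never see a half-merged view.

```python
# Minimal sketch of the pattern (hypothetical names, not Pinterest's code).
# One dict per source artefact, a consolidated dict rebuilt on any change,
# and an atomic reference swap guarded by a lock on the write side.
import threading

class ConsolidatedView:
    def __init__(self):
        self._lock = threading.Lock()   # serialises rebuild + swap (write side)
        self._per_source = {}           # bundle name -> that bundle's own map
        self._consolidated = {}         # the hot-path read surface

    def refresh_source(self, bundle, new_map):
        """Called by the file watcher for one bundle; others untouched."""
        with self._lock:
            self._per_source[bundle] = new_map
            merged = {}
            for source_map in self._per_source.values():
                merged.update(source_map)   # scan and merge ALL source maps
            self._consolidated = merged     # atomic reference swap

    def lookup(self, key):
        """Hot path: read the current snapshot reference, no lock taken."""
        return self._consolidated.get(key)
```

In CPython a plain attribute read is an atomic reference grab, so the read side needs no lock at all; in C++ the equivalent is a shared lock (or an atomic `shared_ptr` load) around the pointer read.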
Pinterest's articulation¶
From the 2026-05-01 Feature Trimmer post (Source: sources/2026-05-01-pinterest-optimizing-ml-workload-network-efficiency-part-i-feature-trimmer):
"Configuration: The root cluster is configured with the active model bundles, and the file path for each corresponding module_info.json is set using GFlags.
Initial Loading: The feature trimmer module loads the content of each module_info.json file into an independent in-memory map.
Monitor for Content Updates: A file watcher is attached to each module_info.json. Any content refresh triggers a reload of its contents into the in-memory map for the given model bundle.
Consolidation: On initial loading or when any model bundle is refreshed, the module: Scans and merges all independent maps. Creates a new consolidated map. Atomically replaces the current active consolidated map with the new one.
Concurrency Management w/ Read-Write Lock: Concurrent reads of the consolidated and independent maps are managed with a shared lock. Write access during the map replacement is managed with a unique lock."
Why the two-layer design (per-bundle maps + consolidated map)¶
The pattern could be simplified to a single consolidated map updated in place — but that would couple all bundles' update risk together. The two-layer design gives:
Failure isolation per bundle¶
If bundle A's module_info.json gets corrupted on disk during an update, the trimmer's independent map for bundle A stays on the old version (file-watcher sees the corruption or parse error and keeps the prior content). Bundles B, C, D are unaffected; their refreshes continue to trigger full consolidated-map rebuilds. A "bad bundle" fails in isolation.
Pinterest's explicit framing: "If a model bundle's file gets corrupted on disk during an update, the feature trimmer keeps using the old, in-memory version for that bundle. Because each bundle has its own map, the feature trimmer can still successfully update the information for all the other model bundles."
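The isolation mechanism can be illustrated with a hedged sketch (hypothetical names; the real watcher and error handling are not published): a failed parse simply leaves that bundle's in-memory map on its prior content, and only a successful parse replaces it.

```python
# Hypothetical sketch: a corrupt artefact leaves that bundle's map on its
# prior version; reloads for the other bundles proceed normally.
import json

per_source = {
    "bundle_a": {"model_a": ["f1"]},
    "bundle_b": {"model_b": ["f2"]},
}

def on_file_change(bundle, raw_bytes):
    """File-watcher callback for one bundle's module_info.json."""
    try:
        per_source[bundle] = json.loads(raw_bytes)  # reload just this bundle
    except json.JSONDecodeError:
        # Keep the prior in-memory content; surface the failure via
        # observability/alerting rather than poisoning the view.
        pass

on_file_change("bundle_a", b'{"model_a": ["f1", "f9"]}')  # good update lands
on_file_change("bundle_b", b'{"model_b": [')              # corrupt on disk
```

After both events, bundle_a carries the new allowlist while bundle_b still serves its last good version.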
Lock-free-enough reads¶
The hot path (every score request does a trim lookup) takes only the shared read lock on the consolidated map. Atomic swap happens under a unique lock for microseconds — just enough to swap a pointer. Reads never wait on bundle parsing, which is the slow part of the refresh.
No partial-state reads¶
The consolidated map is never partially rebuilt in place. A read under shared lock sees either the old map or the new map — never a half-merged state where some models are on new allowlists and others are on stale ones.
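The old-map-or-new-map property follows from how a reader uses the structure: it pins one snapshot reference at the start of a request and does every lookup against that snapshot. A minimal sketch (illustrative names; in C++ this reference grab happens under the shared lock):

```python
# Sketch of "old map or new map, never half-merged": a reader grabs one
# snapshot reference and every lookup in that request uses it.
consolidated = {"model_a": "allowlist_v1", "model_b": "allowlist_v1"}

def handle_score_request(view):
    snapshot = view  # one reference grab = one coherent view for the request
    return [snapshot[m] for m in ("model_a", "model_b")]

before = handle_score_request(consolidated)
# A refresh builds a whole new dict privately, then swaps the reference:
consolidated = {"model_a": "allowlist_v2", "model_b": "allowlist_v2"}
after = handle_score_request(consolidated)
```

A request started before the swap sees v1 for every model; one started after sees v2 for every model; no request mixes the two.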
When it fits¶
- Read-heavy + infrequent writes. Score requests happen millions of times per second; bundle refreshes happen hourly to daily.
- Multiple independent update sources that should be decoupled for failure isolation.
- The consolidated view is small enough to fit in memory and cheap enough to rebuild on any source change (Pinterest's module_info.json is kilobytes per bundle).
- Strong read consistency within a single request — one score request sees one coherent view.
- Restart-free hot reload required — the config updates without taking the host offline.
When it doesn't fit¶
- Consolidated view is large (gigabytes). Full rebuild on every update becomes expensive; use differential updates.
- Writes are the common case, reads are rare. A consolidated-map design assumes the inverse.
- Cross-source ordering matters — the pattern consolidates maps without any cross-source transactional guarantee beyond "the snapshot is internally consistent at rebuild time."
- Updates need confirmation back to the publisher — this is one-way push-from-disk; no ack semantics.
Failure modes¶
- Thundering-herd refresh if many bundles update simultaneously. Rebuilds serialise under the unique lock but the rebuild work itself may contend. Mitigated at Pinterest scale because bundle deploys are staged (canary → prod) rather than simultaneous.
- Stale-bundle silent persistence — if a bundle's update silently corrupts and no on-call alert fires, the trimmer runs on a stale allowlist for that bundle indefinitely. Pinterest mitigates at init ("failures while parsing the required module_info artifacts are emitted to our observability dashboard and trigger an on-call alert") but not continuously at runtime.
- Consolidation-time corruption — a bug in the merge logic could produce a broken consolidated map. The atomic swap discipline doesn't protect against this; only testing + validation at merge time does.
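One way to harden the consolidation-time case — not described in the source, purely an illustrative guard with hypothetical names — is to validate the candidate map before it replaces the active one, keeping the old map when validation fails:

```python
# Hypothetical merge-time validation guard: the atomic swap cannot catch a
# broken merge, so the candidate map is checked before it replaces the
# active one.
def validate(merged):
    # Reject obviously broken output, e.g. a model with an empty allowlist.
    return bool(merged) and all(
        isinstance(v, list) and v for v in merged.values()
    )

active = {"model_a": ["f1"]}                    # current consolidated map
candidate = {"model_a": ["f1"], "model_b": []}  # merge bug: empty allowlist
if validate(candidate):
    active = candidate                          # safe to swap in
# Otherwise keep the old map and raise an alert instead.
```

The swap stays atomic either way; validation only decides whether the swap happens.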
- Host-launch blocked on bundle parsing — Pinterest explicitly chose not to block host launch on parse failure, because doing so "would undermine our ability to respond to capacity-related incidents."
Seen in¶
- 2026-05-01 Pinterest — Optimizing ML Workload Network Efficiency (Part I): Feature Trimmer (sources/2026-05-01-pinterest-optimizing-ml-workload-network-efficiency-part-i-feature-trimmer) — canonical; file watcher per module_info.json → independent per-bundle maps → consolidated merge → atomic swap → RW lock semantics → per-bundle failure isolation.
Sibling patterns¶
- patterns/hot-swap-retrofit — sibling pattern at the component-swap altitude; runtime replacement of live components.
- patterns/runtime-backend-swap-on-failure — sibling at the backend-failover altitude; swap under failure, here swap under config update.
- Copy-on-write data structures — same atomic-swap principle at data-structure altitude.
- Immutable-collection-with-atomic-reference (Clojure atoms, Scala AtomicReference[Map]) — same principle expressed in language primitives.
Related¶
- systems/pinterest-feature-trimmer — the canonical instance.
- concepts/data-plane-atomicity — the underlying property this pattern preserves on the consolidated map.
- concepts/blast-radius — per-bundle isolation contains the blast radius of a bad update.
- patterns/hot-swap-retrofit — sibling at component altitude.
- patterns/runtime-backend-swap-on-failure — sibling at backend altitude.