
File-watcher atomic-swap consolidated map

Problem

An in-process server module needs to maintain a consolidated lookup structure — e.g., model_name → version → feature_allowlist — assembled from multiple independently-updated source artefacts on disk (one file per model bundle). Requirements:

  • Reads must be fast and lock-free enough for the hot path (millions of lookups per second at Pinterest's scale).
  • Writes arrive asynchronously (bundle deploys roll out independently) and must be visible without restart.
  • Concurrent updates must not corrupt the consolidated view — multiple bundles can refresh in overlapping windows.
  • A corrupt or partial update to one source must not poison the consolidated view for all others.

Solution

Three-layer structure with file watchers, per-source maps, a consolidated map, and atomic swap under a read-write lock.

Independent source maps[bundle]       (one per on-disk artefact)
          └── file watcher per artefact → triggers reload of that map only
                                           (other bundles' maps untouched)

Consolidated map                       (the hot-path read surface)
          └── rebuilt from ALL independent maps on any change
          └── atomically replaces the current active consolidated map

RW lock
  ├── shared lock  — reads of consolidated map + independent maps
  └── unique lock  — the atomic swap of the consolidated map
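
A minimal C++ sketch of the three layers, under stated assumptions: every name below (TrimmerConfig, OnBundleRefreshed, Snapshot, the type aliases) is illustrative rather than Pinterest's code, std::shared_mutex stands in for the read-write lock, and each bundle is collapsed to a single model for brevity.

#include <memory>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Illustrative shapes: model_name -> version -> feature_allowlist.
using FeatureAllowlist = std::vector<std::string>;
using VersionMap = std::unordered_map<std::string, FeatureAllowlist>;
using ConsolidatedMap = std::unordered_map<std::string, VersionMap>;

class TrimmerConfig {
 public:
  // Called once a bundle's artefact has been parsed. Parsing, the slow part
  // of a refresh, happens before this call and outside any lock.
  void OnBundleRefreshed(const std::string& bundle, VersionMap parsed) {
    std::unique_lock lock(mu_);
    independent_[bundle] = std::move(parsed);   // other bundles' maps untouched
    // Merge ALL independent maps into a fresh consolidated map. The merge is
    // cheap (kilobyte-scale artefacts); the swap is one pointer assignment.
    auto fresh = std::make_shared<ConsolidatedMap>();
    for (const auto& [name, versions] : independent_) {
      (*fresh)[name] = versions;                // simplification: bundle == model
    }
    consolidated_ = std::move(fresh);           // the atomic swap
  }

  // Hot path: shared lock, held only long enough to copy the pointer.
  std::shared_ptr<const ConsolidatedMap> Snapshot() const {
    std::shared_lock lock(mu_);
    return consolidated_;
  }

 private:
  mutable std::shared_mutex mu_;
  std::unordered_map<std::string, VersionMap> independent_;  // one per bundle
  std::shared_ptr<const ConsolidatedMap> consolidated_ =
      std::make_shared<ConsolidatedMap>();
};

A reader that holds a Snapshot() pointer keeps the old map alive even across a swap, so a request that started before a refresh finishes on the view it started with.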

Pinterest's articulation

From the 2026-05-01 Feature Trimmer post (Source: sources/2026-05-01-pinterest-optimizing-ml-workload-network-efficiency-part-i-feature-trimmer):

"Configuration: The root cluster is configured with the active model bundles, and the file path for each corresponding module_info.json is set using GFlags. Initial Loading: The feature trimmer module loads the content of each module_info.json file into an independent in-memory map. Monitor for Content Updates: A file watcher is attached to each module_info.json. Any content refresh triggers a reload of its contents into the in-memory map for the given model bundle. Consolidation: On initial loading or when any model bundle is refreshed, the module: Scans and merges all independent maps. Creates a new consolidated map. Atomically replaces the current active consolidated map with the new one. Concurrency Management w/ Read-Write Lock: Concurrent reads of the consolidated and independent maps are managed with a shared lock. Write access during the map replacement is managed with a unique lock."

Why the two-layer design (per-bundle maps + consolidated map)

The pattern could be simplified to a single consolidated map updated in place — but that would couple all bundles' update risk together. The two-layer design gives:

Failure isolation per bundle

If bundle A's module_info.json gets corrupted on disk during an update, the trimmer's independent map for bundle A stays on the old version (the reload path hits the corruption or parse error and keeps the prior content). Bundles B, C, and D are unaffected; their refreshes continue to trigger full consolidated-map rebuilds. A "bad bundle" fails in isolation.
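
A sketch of the per-bundle watcher callback, assuming a hypothetical ParseModuleInfo() that throws on corrupt or truncated JSON, and glog-style logging:

// On parse failure nothing is written back, so the independent map for this
// bundle keeps its prior content; other bundles' refreshes are unaffected.
void OnFileChanged(TrimmerConfig& config, const std::string& bundle,
                   const std::string& path) {
  try {
    VersionMap parsed = ParseModuleInfo(path);  // hypothetical; may throw
    config.OnBundleRefreshed(bundle, std::move(parsed));
  } catch (const std::exception& e) {
    LOG(ERROR) << "module_info parse failed for " << bundle
               << ": " << e.what();  // bundle fails in isolation
  }
}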

Pinterest's explicit framing: "If a model bundle's file gets corrupted on disk during an update, the feature trimmer keeps using the old, in-memory version for that bundle. Because each bundle has its own map, the feature trimmer can still successfully update the information for all the other model bundles."

Lock-free-enough reads

The hot path (every score request does a trim lookup) takes only the shared read lock on the consolidated map. Atomic swap happens under a unique lock for microseconds — just enough to swap a pointer. Reads never wait on bundle parsing, which is the slow part of the refresh.

No partial-state reads

The consolidated map is never partially rebuilt in place. A read under shared lock sees either the old map or the new map — never a half-merged state where some models are on new allowlists and others are on stale ones.
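
Continuing the sketch above, both properties fall out of the read path: the shared lock guards only the pointer copy, and all traversal runs on an immutable snapshot, so a request sees the old map or the new one in full.

#include <algorithm>

bool IsFeatureAllowed(const TrimmerConfig& config, const std::string& model,
                      const std::string& version, const std::string& feature) {
  auto snapshot = config.Snapshot();   // shared lock held for microseconds
  auto m = snapshot->find(model);
  if (m == snapshot->end()) return false;
  auto v = m->second.find(version);
  if (v == m->second.end()) return false;
  const auto& allow = v->second;       // one coherent allowlist, never half-merged
  return std::find(allow.begin(), allow.end(), feature) != allow.end();
}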

When it fits

  • Read-heavy + infrequent writes. Score requests happen millions of times per second; bundle refreshes happen hourly to daily.
  • Multiple independent update sources that should be decoupled for failure isolation.
  • The consolidated view is small enough to fit in memory and cheap enough to rebuild on any source change (Pinterest's module_info.json is kilobytes per bundle).
  • Strong read consistency within a single request — one score request sees one coherent view.
  • Restart-free hot reload required — the config updates without taking the host offline.

When it doesn't fit

  • Consolidated view is large (gigabytes). Full rebuild on every update becomes expensive; use differential updates.
  • Writes are the common case, reads are rare. A consolidated-map design assumes the inverse.
  • Cross-source ordering matters — the pattern consolidates maps without any cross-source transactional guarantee beyond "the snapshot is internally consistent at rebuild time."
  • Updates need confirmation back to the publisher — this is one-way push-from-disk; no ack semantics.

Failure modes

  • Thundering-herd refresh if many bundles update simultaneously. Rebuilds serialise under the unique lock but the rebuild work itself may contend. Mitigated at Pinterest scale because bundle deploys are staged (canary → prod) rather than simultaneous.
  • Stale-bundle silent persistence — if a bundle's update silently corrupts and no on-call alert fires, the trimmer runs on a stale allowlist for that bundle indefinitely. Pinterest mitigates at init ("failures while parsing the required module_info artifacts are emitted to our observability dashboard and trigger an on-call alert") but not continuously at runtime.
  • Consolidation-time corruption — a bug in the merge logic could produce a broken consolidated map. The atomic swap discipline doesn't protect against this; only testing + validation at merge time does (see the sketch after this list).
  • Host-launch blocked on bundle parsing — Pinterest explicitly chose not to block host launch on parse failure, because doing so "would undermine our ability to respond to capacity-related incidents."
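
One possible mitigation for the consolidation-time failure mode, an assumption rather than anything the post describes: run invariant checks on the freshly built map and refuse the swap on failure, so the previous consolidated map stays active.

// Hypothetical invariants; the real checks depend on what a valid
// consolidated view must guarantee for the trimmer.
bool ValidateConsolidated(const ConsolidatedMap& m) {
  for (const auto& [model, versions] : m) {
    if (model.empty() || versions.empty()) return false;
    for (const auto& [version, allowlist] : versions) {
      if (allowlist.empty()) return false;  // an empty allowlist is suspicious
    }
  }
  return true;
}

In OnBundleRefreshed() above, the swap then becomes conditional: swap only when ValidateConsolidated(*fresh) passes, and alert otherwise.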

Seen in

  • Pinterest Feature Trimmer (sources/2026-05-01-pinterest-optimizing-ml-workload-network-efficiency-part-i-feature-trimmer): the consolidated trim-config map inside the root cluster, as quoted above.

Sibling patterns

  • patterns/hot-swap-retrofit — sibling pattern at the component-swap altitude; runtime replacement of live components.
  • patterns/runtime-backend-swap-on-failure — sibling at the backend-failover altitude; swap under failure, here swap under config update.
  • Copy-on-write data structures — same atomic-swap principle at data-structure altitude.
  • Immutable-collection-with-atomic-reference (Clojure atoms, Scala AtomicReference[Map]) — same principle expressed in language primitives.
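
For comparison, the same principle with a language primitive in place of the RW lock, a sketch assuming C++20's std::atomic<std::shared_ptr> and reusing the aliases above (implementations may still lock internally, so this is lock-free in shape rather than by guarantee):

#include <atomic>
#include <memory>

std::atomic<std::shared_ptr<const ConsolidatedMap>> active{
    std::make_shared<const ConsolidatedMap>()};

// Reader: one atomic load, no mutex on the read path at all.
std::shared_ptr<const ConsolidatedMap> Read() { return active.load(); }

// Writer: build the new map off to the side, publish it in one store.
void Publish(std::shared_ptr<const ConsolidatedMap> fresh) {
  active.store(std::move(fresh));
}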