CONCEPT Cited by 1 source
Immutable index state¶
Definition¶
Immutable index state is a state-management discipline for search engines (and any similarly-shaped service) in which the current state is held in an immutable object, and every state change produces a new immutable state object that is atomically swapped in as the current reference after being committed to the state backend. Client requests that have already captured a reference to the state keep operating on the old snapshot; subsequent requests see the new one. Live state is never mutated in place.
What immutability buys¶
Three independent properties fall out of immutable-state-plus-atomic-swap:
1. Intra-request consistency¶
"Client requests retrieve the current state once, and reference it for the remainder of the request. Since the state object is immutable, changes will not be visible during the processing of a single request." (Source: sources/2025-05-08-yelp-nrtsearch-100-incremental-backups-lucene-10)
The alternative — live-mutable state — means state values sampled at different points during a request's execution can disagree with each other, leading to inconsistencies and edge cases that are hard to test for. Immutability shifts the problem from "manage read/write conflicts" to "grab one reference, use it everywhere."
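The "grab one reference, use it everywhere" pattern can be sketched as follows. This is a minimal illustration, not Nrtsearch's actual code: `IndexState`, `StateHolder`, `handle_request`, and the field names are hypothetical, and the mid-request hook only simulates a state change arriving while a request is in flight.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IndexState:
    """Hypothetical immutable snapshot; the fields are illustrative only."""
    version: int
    default_top_hits: int


class StateHolder:
    """Holds the reference to the current snapshot."""

    def __init__(self, initial: IndexState) -> None:
        self.current = initial

    def swap(self, new_state: IndexState) -> None:
        # One reference assignment: readers see the old or the new object.
        self.current = new_state


def handle_request(holder: StateHolder, mid_request_hook=None) -> tuple:
    state = holder.current            # capture the reference once
    first = state.default_top_hits    # first sample
    if mid_request_hook is not None:
        mid_request_hook()            # simulated state change during the request
    second = state.default_top_hits   # second sample, same snapshot
    return state.version, first, second


holder = StateHolder(IndexState(version=1, default_top_hits=10))


def concurrent_change() -> None:
    # Build a new snapshot from the old one and swap it in.
    holder.swap(replace(holder.current, version=2, default_top_hits=50))


result = handle_request(holder, concurrent_change)
next_result = handle_request(holder)
print(result)       # (1, 10, 10): the in-flight request never saw the change
print(next_result)  # (2, 50, 50): the next request sees the new snapshot
```

Because the dataclass is frozen, an accidental `state.version = 3` inside a handler raises `dataclasses.FrozenInstanceError` instead of silently changing what other requests see.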
2. Atomic visibility of committed changes¶
Because the swap is a single pointer update that happens only after the commit to the state backend succeeds:
- partial changes never become visible;
- state changes can't be observed before they're committed (durable);
- a crash during the apply leaves the old state in place, because readers never saw the half-built new one.
Yelp puts it: "After the change is committed to the state backend (EBS or S3), the new state atomically replaces the reference to the old state. This prevents changes from being observed before they are committed."
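The commit-then-swap ordering described above can be sketched like this. The sketch assumes a hypothetical `InMemoryBackend` standing in for the durable backend (EBS or S3 in the source) and a hypothetical `StateManager`; neither name comes from Nrtsearch. The point it illustrates is only the ordering: the reference is replaced after the commit returns, so a failed commit leaves readers on the old state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IndexState:
    """Hypothetical immutable snapshot."""
    version: int
    live: bool


class InMemoryBackend:
    """Stand-in for a durable state backend; can simulate one failure."""

    def __init__(self) -> None:
        self.committed = None
        self.fail_next = False

    def commit(self, state: IndexState) -> None:
        if self.fail_next:
            self.fail_next = False
            raise OSError("simulated backend failure")
        self.committed = state


class StateManager:
    def __init__(self, initial: IndexState, backend: InMemoryBackend) -> None:
        self._backend = backend
        self._backend.commit(initial)
        self._current = initial

    @property
    def current(self) -> IndexState:
        return self._current

    def apply(self, **changes) -> IndexState:
        old = self._current
        new = replace(old, version=old.version + 1, **changes)
        self._backend.commit(new)  # may raise; nothing is visible yet
        self._current = new        # swap only after the commit succeeded
        return new


backend = InMemoryBackend()
manager = StateManager(IndexState(version=1, live=False), backend)

backend.fail_next = True
try:
    manager.apply(live=True)
except OSError:
    pass
after_failure = manager.current    # still IndexState(version=1, live=False)

manager.apply(live=True)
after_success = manager.current    # IndexState(version=2, live=True)
```

After the failed attempt, both the visible state and the backend still hold version 1, so there is nothing to roll back.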
3. Lock-free readers¶
Readers don't need locks because they never observe a state being mutated — they observe either the pre-change snapshot or the post-change snapshot, never an in-flight one. The only synchronisation is the writer's commit+swap sequence.
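A small threaded sketch of this read path, under stated assumptions: the class and function names are hypothetical, and it relies on CPython rebinding an attribute as a single reference store. In other runtimes the same role would be played by an explicit atomic reference, such as Java's `AtomicReference`. Each snapshot carries an invariant (`label` always matches `version`), and the readers check it without taking any lock.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexState:
    """Hypothetical snapshot with an invariant: label == f"v{version}"."""
    version: int
    label: str


class StateHolder:
    def __init__(self, initial: IndexState) -> None:
        self._current = initial
        self._write_lock = threading.Lock()  # serialises writers only

    def get(self) -> IndexState:
        return self._current                 # read path takes no lock

    def publish_next(self) -> None:
        with self._write_lock:
            nxt = self._current.version + 1
            # Build the complete new object first, then swap the reference.
            self._current = IndexState(version=nxt, label=f"v{nxt}")


holder = StateHolder(IndexState(version=0, label="v0"))
inconsistencies = []


def reader(iterations: int) -> None:
    for _ in range(iterations):
        state = holder.get()
        if state.label != f"v{state.version}":
            inconsistencies.append(state)    # would indicate a torn read


def writer(iterations: int) -> None:
    for _ in range(iterations):
        holder.publish_next()


threads = [threading.Thread(target=reader, args=(2000,)) for _ in range(4)]
threads.append(threading.Thread(target=writer, args=(500,)))
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Readers never see a half-updated pair because no object is ever modified after it is published; the writer lock exists only to keep concurrent writers from losing each other's updates.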
Canonical instance: Yelp Nrtsearch 1.0.0¶
Pre-1.0 Nrtsearch applied state changes in place and in a non-isolated way: "state values may change when sampled multiple times during the processing of a single request. This could lead to inconsistencies and more edge cases to handle."
The 1.0 redesign makes index state an immutable representation that is rebuilt from scratch per change (merge old + change → new immutable object), committed to the state backend, then atomically swapped. This composes with two adjacent changes in the same release:
- decoupled state commit from data commit — state changes are durable per-request, not bundled into a data commit;
- hot reload on replicas — replicas can observe the swap without restarting.
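The "merge old + change → new immutable object" step might look roughly like the sketch below. It is an illustration under assumptions, not the Nrtsearch implementation: the `fields` mapping, the `merge` function, and the example field names are all hypothetical. The nested mapping is wrapped in `types.MappingProxyType` so that the snapshot stays read-only below the top level as well.

```python
from dataclasses import dataclass, replace
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class IndexState:
    """Hypothetical snapshot: a version plus a read-only field registry."""
    version: int
    fields: Mapping[str, str]  # field name -> field type


def merge(old: IndexState, new_fields: Mapping[str, str]) -> IndexState:
    # Copy the old mapping, layer the change on top, and freeze the result.
    merged = {**old.fields, **new_fields}
    return replace(old, version=old.version + 1,
                   fields=MappingProxyType(merged))


v1 = IndexState(version=1, fields=MappingProxyType({"title": "text"}))
v2 = merge(v1, {"rating": "float"})

print(dict(v1.fields))  # {'title': 'text'}: the old snapshot is untouched
print(dict(v2.fields))  # {'title': 'text', 'rating': 'float'}
```

Since `merge` is a pure function of the old state and the change, the result can be committed and swapped in without any reader having observed an intermediate form.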
Shape vs. adjacent primitives¶
- Immutable state ≠ copy-on-write pages. Immutable state is coarse-grained (whole index-state object swapped), conceptually similar to COW forks but at a higher level and without the storage-layer overhead of page refcounting.
- Immutable state ≠ MVCC. MVCC keeps multiple versions around and resolves which one a reader sees via snapshot isolation over a timestamp or xmin/xmax; immutable index state publishes a single current reference through an atomic pointer update, and an older snapshot survives only while in-flight requests still hold it.
- Immutable state ≠ event sourcing. Event-sourced systems store every change as an event and derive current state by replay; immutable-state systems store only the current object (or a small chain), with a separate commit log on the side for recovery.
Seen in¶
- sources/2025-05-08-yelp-nrtsearch-100-incremental-backups-lucene-10 — canonical wiki instance. Nrtsearch 1.0.0's state-management overhaul makes index state an immutable object rebuilt per change and atomically swapped after commit; intra-request consistency, atomic visibility of committed changes, and lock-free readers all follow.