CONCEPT Cited by 4 sources
Immutable Object Storage¶
Immutable object storage is a model in which the stored unit — the object — cannot be partially modified after it is written. Writes produce new versions (or new objects); reads see whole-object values; updates are whole-object replacements, not in-place mutations.
This is the low-level data model that S3 and most cloud object stores expose. "An HTTP-based storage system for immutable objects with four core verbs (PUT, GET, DELETE and LIST)" — Warfield.
(Source: sources/2025-03-14-allthingsdistributed-s3-simplicity-is-table-stakes)
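The whole-object contract above can be sketched as a toy in-memory store — a minimal model of the four verbs, not the S3 API itself (the class and method names are illustrative). Note that `put` never mutates in place; it appends a new immutable version, which is why versioning falls out almost for free:

```python
import uuid


class ImmutableObjectStore:
    """Toy in-memory model of an immutable object store.

    The four verbs (PUT, GET, DELETE, LIST) operate on whole objects only.
    A PUT never edits bytes in place: it appends a new immutable version,
    so readers always see a complete old value or a complete new value.
    """

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, bytes)

    def put(self, key, data: bytes) -> str:
        version_id = uuid.uuid4().hex
        self._versions.setdefault(key, []).append((version_id, bytes(data)))
        return version_id

    def get(self, key, version_id=None) -> bytes:
        versions = self._versions[key]
        if version_id is None:
            return versions[-1][1]  # latest whole-object value
        return dict(versions)[version_id]  # any retained older version

    def delete(self, key):
        self._versions.pop(key, None)

    def list(self, prefix=""):
        return sorted(k for k in self._versions if k.startswith(prefix))
```

Because an overwrite is just another appended version, "keep old versions" is one line of bookkeeping — the model's analogue of S3 object versioning.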
Why immutability¶
- Simple replication and durability semantics. An object is either the old one or the new one; there is no partial-update window to reason about across replicas.
- Easy versioning. Because overwrites are replacements, supporting "keep old versions" is a natural extension (S3 object versioning).
- Low write-coordination cost. No need for distributed locks for sub-object ranges.
- Good substrate for higher-level abstractions — see systems/apache-iceberg building a mutable "table" on top of immutable Parquet objects via snapshotting.
What immutability forces upward¶
Mutability that many workloads still need is pushed above the storage layer:
- Row-level updates → concepts/open-table-format (Iceberg / Delta / Hudi) built over immutable Parquet.
- Schema evolution → metadata layers that describe table state across many objects.
- Atomic multi-writer coordination → conditional operations / CAS on object metadata (patterns/conditional-write) rather than in-place edits.
- "Mutable" semantics in client libraries (e.g. embedded DB files on S3) → typically implemented as whole-file replace with conditional-write guards.
The tension the S3-at-19 post calls out¶
"Objects are simple and immutable, but tables are neither."
That single sentence is the whole argument for S3 Tables: once customers need table semantics over immutable objects, someone has to own the mutable-on-top-of-immutable glue (metadata, compaction, GC). Either the customer owns it via Iceberg client code, or the platform owns it via systems/s3-tables.
The file-semantics escape hatch (2026)¶
systems/s3-files introduces a second way out: instead of building mutability over immutable objects in client libraries, expose a filesystem presentation of the same S3 data via an NFS mount backed by EFS. File-layer mutations happen with full filesystem semantics (in-place writes, rename, append, mmap); the concepts/stage-and-commit mechanism batches those mutations and translates them back to whole-object PUTs on the S3 side. Warfield's characterisation of the two worlds, side by side:
"Files are an operating system construct… Application APIs for files are built to support the idea that I can update a record in a database in place, or append data to a log, and that you can concurrently access that file and see my change almost instantaneously, to an arbitrary sub-region of the file."
"Now if we flip over to object world, the idea of writing to the middle of an object while someone else is accessing it is more or less sacrilege. The immutability of objects is an assumption that is cooked into APIs and applications."
The 2026 design lesson: immutability as an object-storage invariant is load-bearing (at-least-once notifications, CRR, log processors, image-transcoding pipelines all depend on whole-object-creation semantics) and must be preserved. File semantics are delivered alongside in a distinct presentation layer — see concepts/boundary-as-feature and concepts/file-vs-object-semantics — rather than by weakening the object invariant.
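The stage-and-commit idea can be sketched as a staging buffer that accepts file-style mutations (sub-range writes, appends) and only ever emits whole-object PUTs downstream. This is a minimal illustrative model of the pattern, not the S3 Files implementation; `StagedFile`, `Store`, and their methods are assumed names:

```python
class Store(dict):
    """Toy object store: whole-object get/put only."""

    def get(self, key):
        return dict.get(self, key, b"")

    def put(self, key, data: bytes):
        self[key] = bytes(data)


class StagedFile:
    """Buffer file-semantics mutations locally; commit as one whole-object PUT.

    The object layer never observes a partial write: until commit(), it still
    holds the old complete object; after commit(), the new complete object.
    """

    def __init__(self, store, key):
        self.store, self.key = store, key
        self.buf = bytearray(store.get(key))  # staged copy of the object

    def write_at(self, offset, data: bytes):
        # In-place sub-range write: legal at the file layer, never at the object layer.
        end = offset + len(data)
        if end > len(self.buf):
            self.buf.extend(b"\x00" * (end - len(self.buf)))
        self.buf[offset:end] = data

    def append(self, data: bytes):
        self.buf.extend(data)

    def commit(self):
        # Translate all staged file-layer mutations into one whole-object PUT.
        self.store.put(self.key, bytes(self.buf))
```

The point of the sketch: arbitrary in-place edits exist only in the staging layer, so the object invariant ("objects appear whole, once") survives untouched underneath the file presentation.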
Block-store flavor — Magic Pocket volumes¶
The same "immutable unit, mutate-by-rewrite" contract shows up one level below the object API in Dropbox's Magic Pocket. The immutable unit there is a volume (a fixed-size container of many blobs), not an object:
- Blobs are never modified in place; updates and deletes write new data.
- Volumes are closed once filled and never reopened. Space in a closed volume cannot be reclaimed without rewriting its remaining live blobs into a new volume and retiring the old one.
This pushes the mutability burden one layer up, to the compaction layer. The two-stage reclamation pipeline (GC marks → compaction frees) is a direct consequence of the immutability invariant holding at volume granularity. Magic Pocket's multi-strategy compaction (L1 + L2 + L3) over different volume fill-level ranges is the block-store analogue of what Iceberg's managed compaction does over immutable Parquet files on S3, and what S3 Files' stage-and-commit does over file-level edits. Different data models, same root property.
(Source: sources/2026-04-02-dropbox-magic-pocket-storage-efficiency-compaction)
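The mutate-by-rewrite reclamation above can be sketched in a few lines — an illustrative model of the pattern (never reopen a volume; rewrite live blobs into a fresh volume and retire the old ones whole), not Magic Pocket's implementation. The `live` set stands in for the output of the preceding GC marking stage:

```python
def compact(volumes, live):
    """Reclaim dead-blob space from closed, immutable volumes.

    volumes: volume_id -> {blob_id: bytes}; closed volumes are never reopened.
    live:    blob ids the GC stage has marked as still referenced.

    Copies live blobs into one fresh volume, retires old volumes whole,
    and returns the list of retired volume ids.
    """
    new_volume = {}
    retired = []
    for volume_id, blobs in list(volumes.items()):
        for blob_id, data in blobs.items():
            if blob_id in live:  # dead blobs are simply not copied forward
                new_volume[blob_id] = data
        retired.append(volume_id)
        del volumes[volume_id]  # retire the old volume as a whole unit
    volumes["v-new"] = new_volume  # hypothetical id for the fresh volume
    return retired
```

Note what is absent: no volume is ever edited. Reclamation is copy-live-forward plus whole-volume retirement — the same shape as Iceberg rewriting Parquet files during compaction.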
Seen in¶
- sources/2025-03-14-allthingsdistributed-s3-simplicity-is-table-stakes — the immutable-objects primitive as S3's base data model; the motivation for concepts/open-table-format.
- sources/2026-04-07-allthingsdistributed-s3-files-and-the-changing-face-of-s3 — most explicit articulation of why immutability is load-bearing (CRR, notifications, downstream pipelines); introduces filesystem-presentation-over-immutable-objects as a second way to bridge the mutability gap, alongside the open-table-format route.
- sources/2026-04-02-dropbox-magic-pocket-storage-efficiency-compaction — block-store-level instance: Magic Pocket volumes are closed once filled and never reopened; deletes accumulate unused space that only compaction can reclaim (two-stage GC-then-compaction pipeline; multi-strategy compaction over the volume fill-level distribution).
- sources/2024-02-15-flyio-globally-distributed-object-storage-with-tigris — Tigris preserves the S3-shape immutable-objects contract (PUT/GET/DELETE/LIST, whole-object replacement) as its front-door API over a very different-shaped backend (regional FoundationDB metadata + NVMe byte-cache tier + QuiCK-style distribution queue + pluggable S3 cold tier). Canonical data point that the immutable-objects contract is a portable API surface — applications coded against S3 port to Tigris unchanged — while the backend architecture changes underneath. Immutability also simplifies demand-driven replication (concepts/demand-driven-replication): a replica is either the full identical object or absent, with no partial-write race to resolve across regions.