
CONCEPT Cited by 1 source

Immutable partition

Definition

An immutable partition is a Cassandra (or wide-column-store) partition that has provably stopped receiving new writes — typically because the partition's identifying time range has passed an acceptLimit-style cutoff, the partition's parent Time Slice has rolled over, or the namespace's retention model declares the partition closed. "Immutable" here is operational, not structural: the partition's row set is now fixed, even if the schema permits writes.

Immutability is a load-bearing precondition for safe in-place split / migration / archival of the partition's contents. Splitting or moving a mutable partition is "inherently more complex" because new writes can arrive during the migration window, requiring concurrency-control machinery (write locks, dual-writes, conflict resolution) that splitting a sealed partition does not need.

Canonicalised on the wiki from the Netflix TimeSeries Abstraction team's 2026-06-03 dynamic partition splitting disclosure (Source: sources/2026-06-03-netflix-dynamically-splitting-wide-partitions-in-cassandra-for-time-series-workloads).

Why it matters: surface-area reduction

The Netflix TimeSeries team's design choice is explicit:

"Although splitting mutable partitions is possible, it is inherently more complex. As a first step towards solving this problem, we chose to reduce the surface area of this change by focusing on immutable partitions, while still meaningfully reducing caller timeouts."

The reduce-surface-area discipline used elsewhere applies here: start with the structurally simpler subset of partitions that needs no concurrency control, ship that version, and defer the harder case to follow-up work.

In TimeSeries' shape:

  • The dataset is divided into Time Slices (e.g. data_20260328).
  • Each slice has an acceptLimit (typically 5 seconds) — events with timestamps too far in the past are rejected.
  • Once a slice's time window plus acceptLimit has elapsed, the slice can provably receive no further writes, so it is immutable.
  • The TimeSeries server can compute this immutability at runtime without coordination — it's a function of now() and the slice's known boundaries.
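
The runtime check described in the list above can be sketched in a few lines. This is a minimal illustration, not the TimeSeries implementation: the `TimeSlice` class, its field names, and the use of a `timedelta` for `accept_limit` are assumptions. Only the rule itself (the slice's end plus acceptLimit has elapsed) comes from the source.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TimeSlice:
    """Hypothetical model of a Time Slice; field names are illustrative."""
    name: str
    start: datetime
    end: datetime            # upper bound of the slice's time window
    accept_limit: timedelta  # how far in the past an event timestamp may lie


def is_immutable(slice_: TimeSlice, now: datetime) -> bool:
    """True once the slice's window plus acceptLimit has fully elapsed."""
    return now > slice_.end + slice_.accept_limit


# Example slice covering one UTC day, with an assumed 5-second accept limit.
day = TimeSlice(
    name="data_20260328",
    start=datetime(2026, 3, 28, tzinfo=timezone.utc),
    end=datetime(2026, 3, 29, tzinfo=timezone.utc),
    accept_limit=timedelta(seconds=5),
)
```

Because the function depends only on `now` and the slice's own boundaries, any server can evaluate it locally without consulting other nodes.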

The detection event for dynamic partition splitting carries an immutable: true flag computed at detection time:

{
  "time_slice": "data_20260328",
  "time_series_id": "profileId:123",
  "time_bucket": 7,
  "event_bucket": 2,
  "immutable": true,
  "version": "0"          // reserved to invalidate if partition is no longer immutable
}

The version field is reserved for the future case where mutable partitions are also split — at that point a partition that becomes mutable again (e.g. backfill, late-arriving event, schema change) needs a way for the server to invalidate cached split metadata.
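
One way a version field like this could be used is as a validity check on cached split metadata: the server trusts a cached entry only if it was recorded under the partition's current version. The sketch below is speculative, since the source says only that the field is reserved. `PartitionKey`, `SplitMetadataCache`, and the eviction behaviour are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PartitionKey:
    """Identifies a partition using the fields from the detection event."""
    time_slice: str
    time_series_id: str
    time_bucket: int
    event_bucket: int


class SplitMetadataCache:
    """Hypothetical cache: an entry is valid only for the version it was stored under."""

    def __init__(self) -> None:
        self._entries: Dict[PartitionKey, Tuple[str, dict]] = {}

    def put(self, key: PartitionKey, version: str, metadata: dict) -> None:
        self._entries[key] = (version, metadata)

    def get(self, key: PartitionKey, current_version: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        cached_version, metadata = entry
        if cached_version != current_version:
            # The partition changed after the split was recorded: drop the stale entry.
            del self._entries[key]
            return None
        return metadata
```

Under this assumed scheme, a backfill or late-arriving event would bump the partition's version, and the next lookup would miss the cache rather than serve stale split metadata.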

How immutability is established

Three structural sources:

  1. Time-bounded slice with acceptLimit — Netflix TimeSeries' approach. A slice is immutable when now() > slice.end + acceptLimit.
  2. Sealed-on-rollover — common in append-only / log-structured systems. A partition is sealed when the next partition opens; subsequent writes route elsewhere.
  3. Archival promotion — partition is migrated to immutable cold storage (object store) where mutation is impossible.

Cassandra itself does not have a native concept of an immutable partition; immutability in Cassandra is layered on top by the application, typically by routing-layer or DAL convention (refusing writes for keys whose time-bucket has passed).
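
Because Cassandra does not enforce this itself, the convention has to live in application code. A minimal sketch of such a guard follows, assuming a hypothetical `WriteGuard` in the data access layer that rejects events whose timestamp is older than `accept_limit`. The class, the exception name, and the rejection policy are illustrative and are not an API of Cassandra or of any Netflix library.

```python
from datetime import datetime, timedelta


class LateWriteRejected(Exception):
    """Raised when an event's timestamp is older than the accept limit allows."""


class WriteGuard:
    """Hypothetical data-access-layer check applied before any write is issued."""

    def __init__(self, accept_limit: timedelta) -> None:
        self.accept_limit = accept_limit

    def check(self, event_time: datetime, now: datetime) -> None:
        # Events too far in the past are refused, which is what allows
        # older time buckets to be treated as closed.
        if event_time < now - self.accept_limit:
            raise LateWriteRejected(
                f"event at {event_time.isoformat()} is more than "
                f"{self.accept_limit.total_seconds()}s older than {now.isoformat()}"
            )
```

The guarantee is only as strong as the guard's coverage: any write path that bypasses it (for example, an ad hoc backfill job) can reopen a partition that other components assume is closed.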

What immutability buys you

| Operation | Mutable partition | Immutable partition |
| --- | --- | --- |
| Split into smaller partitions | Need write-side dual-write + conflict resolution + cutover coordination | Single-pass copy + checksum validation + read-path divert |
| Migrate to archive | Write-side replay required | Single-pass copy |
| Compute exact size / row count | Snapshot-only or eventually-consistent | Authoritative |
| Pre/post checksum validation | Mismatches expected from concurrent writes | Mismatch ⇒ correctness bug |
| Repair / anti-entropy | Active during repair window | Repair runs once, terminates |

The Netflix TimeSeries split pipeline relies on immutability to make pre/post checksums meaningful: on a mutable partition, validation could fail simply because new writes arrived between the pre-checksum and the post-checksum, so a mismatch would no longer reliably signal a genuine correctness bug.
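
The role of the checksum can be illustrated with an order-independent digest over a partition's rows, taken once over the original and once over the union of the split partitions. This is a sketch under assumptions: the source does not describe the checksum algorithm, so the XOR-of-SHA-256 scheme, the row shape of `(clustering_key, value)` byte pairs, and the `split_rows` helper are all hypothetical.

```python
import hashlib
from typing import Iterable, List, Tuple

Row = Tuple[bytes, bytes]  # (clustering key, serialized value); illustrative shape


def partition_checksum(rows: Iterable[Row]) -> int:
    """Order-independent digest: XOR of per-row SHA-256 hashes."""
    acc = 0
    for key, value in rows:
        h = hashlib.sha256()
        # Length-prefix the key so (b"ab", b"c") and (b"a", b"bc") hash differently.
        h.update(len(key).to_bytes(4, "big"))
        h.update(key)
        h.update(value)
        acc ^= int.from_bytes(h.digest(), "big")
    return acc


def split_rows(rows: List[Row], parts: int) -> List[List[Row]]:
    """Hypothetical split: distribute rows across `parts` sub-partitions by key hash."""
    out: List[List[Row]] = [[] for _ in range(parts)]
    for key, value in rows:
        idx = int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % parts
        out[idx].append((key, value))
    return out


def validate_split(original: List[Row], parts: List[List[Row]]) -> bool:
    """For an immutable partition, any mismatch indicates a bug in the copy."""
    pre = partition_checksum(original)
    post = partition_checksum(row for part in parts for row in part)
    return pre == post
```

The XOR combination relies on rows being distinct, which holds when each row has a unique clustering key. Identical duplicate rows would cancel each other out, so a production scheme would likely need a stronger combiner.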

Sibling concepts

| Concept | Domain | Same shape |
| --- | --- | --- |
| concepts/immutable-aggregation-window | Stream processing | Time window provably won't receive late events |
| concepts/immutable-segment-file | LSM-tree storage | SSTable file provably won't be mutated |
| concepts/immutable-object-storage | Object storage | Object provably won't be overwritten |
| concepts/immutable-index-state | Search index | Index segment provably sealed |

The shared discipline: once a thing is provably no longer changing, all the operations that needed concurrency control simplify dramatically. Immutability is not just a property — it's a simplification mechanism for downstream operations.

Seen in

  • sources/2026-06-03-netflix-dynamically-splitting-wide-partitions-in-cassandra-for-time-series-workloads: First wiki canonicalisation as a load-bearing precondition for in-place partition splitting. The TimeSeries server computes immutability at detection time using acceptLimit-derived time bounds, emits an immutable: true flag in the Kafka detection event, and reserves a version: "0" field for future invalidation when mutable-partition splits ship. The team explicitly defers mutable-partition splits as future work: "There is more work planned around this feature, like splitting mutable wide partitions, or re-processing previously failed splits."