PATTERN Cited by 1 source

Bloom-filter redirect to split partition¶

Definition¶

A Bloom-filter-gated read-path divert that decides on every read whether the partition has been split or not — and only on a Bloom-filter hit incurs the cost of a metadata-table lookup that tells the server where to read the split partition from. The pattern is a transparent routing layer that sits between the read API and the underlying PartitionReader: most reads (no split exists) skip the routing layer entirely; only reads on split partitions incur extra work.

Canonicalised on the wiki by Netflix's TimeSeries Abstraction in the 2026-06-03 dynamic-partition-splitting disclosure (Source: sources/2026-06-03-netflix-dynamically-splitting-wide-partitions-in-cassandra-for-time-series-workloads).

Read path with the pattern¶

TimeSeries server receives read(ts_id, time_range)
  │
  ▼
Bloom-filter check on (ts_id, time_bucket, event_bucket)         ← single-digit microseconds
  │
  ├── miss (probable) → PartitionReader against original table   ← fast path, free routing cost
  │                                                                (also catches false-misses safely
  │                                                                 because Bloom filter has no FN)
  │
  └── hit (possible) → wide_row metadata read-through cache       ← fast on cache hit, O(disk) on miss
                       │
                       ▼
                       Metadata returns:
                         pre_split_data:  what to read
                         post_split_data: where to read it from
                                          + strategy block (target_event_buckets, start_event_bucket)
                       │
                       ▼
                       PartitionReader against split table
                         (same schema → same reader code)
                       │
                       ▼
                       Result returned

Bloom-filter check cost: single-digit microseconds or better — "making this diversion practically invisible to the callers."

Why a Bloom filter (and not e.g. a metadata-table-only lookup)¶

Reads on non-split partitions vastly outnumber reads on split partitions. If the read path always consulted the wide_row metadata table, every read would pay the metadata-lookup cost — an extra round trip on the hot path of a system designed for millions of reads per second.

The Bloom filter inverts the cost:

No split exists → Bloom filter says definitely no → skip metadata lookup entirely. The vast majority of reads.
Split exists → Bloom filter says probably yes → consult metadata. Bounded by the (much smaller) count of split partitions.

Bloom filters never produce false negatives — when the filter says no, the read can confidently skip the metadata lookup. False positives happen, but they cost only the metadata lookup itself, which returns no entry and the read proceeds against the original.

Why an in-memory Bloom filter, loaded periodically¶

The partition-key set of completed splits changes only when new splits complete. The set:

Grows over time as more partitions are split.
Shrinks rarely (only on operational rollback, which re-loads from scratch).
Doesn't change on the read path — only on the splitter completion path.

This shape fits an in-memory snapshot loaded periodically much better than a live stream:

Reads see a consistent snapshot until next reload.
Reload cadence trades freshness against load on the metadata-source.
Memory footprint scales with split-count, not partition-count.

The post asserts: "due to the compactness of partition keys, and ratio of wide partitions in a given dataset, the filters fit comfortably in each server instance." Memory footprint is monitored as an operational concern but does not require rebalancing.

Composition with the metadata table¶

The Bloom filter answers "is this partition split?" — the metadata table answers "what's the split layout?" The pattern composes the two:

Layer	Latency	Purpose
Bloom filter	microseconds	Skip metadata lookup for non-split
Read-through cache (in front of metadata)	microseconds on hit	Skip disk lookup for already-resolved splits
Metadata table	disk-O(read) on cache miss	Authoritative split layout
`PartitionReader` against split table	normal partition read	Returns actual data

Each layer protects the next. The vast majority of reads stop at the Bloom-filter layer. Splits hit the cache. Cache misses hit the metadata table. The metadata table hits the actual split partitions. Every layer is fast on its happy path and the architecture is forward-compatible (more layers can be added without rewriting the read API).

Sibling: Bloom filter as pre-storage gate¶

This pattern is a sibling of patterns/bloom-filter-membership-test-before-storage-fetch (Vercel global routing) — same primitive (Bloom filter as cheap pre-filter to skip an authoritative-but-expensive lookup), different consumer:

Pattern	What's gated	Consequence on hit	Consequence on miss
Bloom-filter pre-storage fetch (Vercel)	Storage fetch	Authoritative storage read	Skip storage entirely
Bloom-filter redirect (this pattern)	Metadata-table lookup	Read split partition via metadata	Read original partition (still authoritative)

The Vercel pattern's miss skips storage; the Netflix pattern's miss reads the original (which is correct — the partition wasn't split). Both leverage the no-FN guarantee of Bloom filters.

Trade-offs¶

Pro	Con
Microsecond routing decision on the hot path	Bloom filter must be loaded into every server's memory
Free for non-split reads (vast majority)	Bloom filter freshness lag — newly-completed splits aren't routed until reload
Composes with read-through cache for split reads	Extra reload coordination required
Memory footprint scales with split count, not data count	False positives still incur metadata lookup
Operational-safety friendly: original-partition-fallback works on FN-impossible filter	Bloom-filter sizing must be planned for max split count

Caveats not disclosed in source¶

False-positive rate, hash count, and bit-array size not specified.
Reload cadence and bandwidth cost not characterised.
What the Bloom filter is keyed on exactly — likely (time_series_id, time_bucket, event_bucket) from the post's read-divert flow, but not explicitly enumerated.

Seen in¶

sources/2026-06-03-netflix-dynamically-splitting-wide-partitions-in-cassandra-for-time-series-workloads — Canonical wiki home. Netflix TimeSeries Abstraction loads partition keys of COMPLETED splits into in-memory Bloom filters periodically; every read consults the filter; on hit, reads wide_row metadata table (read-through cached) for split layout and dispatches to existing PartitionReader against the split table; on miss / fallback, reads original. Bloom-filter check is "single-digit microseconds or better." Filter size monitored: "The filters fit comfortably in each server instance."

concepts/bloom-filter — the primitive.
concepts/dynamic-partition-splitting — the broader concept this pattern is the read-path piece of.
concepts/wide-partition-problem — the failure mode being remediated.
patterns/bloom-filter-membership-test-before-storage-fetch — sibling Bloom-filter routing pattern (Vercel).
patterns/dynamic-partition-split-async-pipeline — the pipeline this pattern is the read-path piece of.
patterns/keep-original-partition-as-fallback-during-split — the operational-safety pattern this composes with.
systems/netflix-timeseries-abstraction — the canonical instance.