
CONCEPT Cited by 1 source

Two-level map KV model

The two-level map KV model is a key-value data shape of the form

HashMap<String, SortedMap<Bytes, Bytes>>

where the first level is a hashed string id (the partition key) and the second level is a sorted map of byte-keys to byte-values. One structure covers flat maps, named sets, structured records, and time-ordered events — which is why it's attractive as the uniform surface of an abstraction layer over multiple storage engines.

The shape

  • Level 1: id (string, hashed). Partitioning unit: all items under one id live together on a single replica set.
  • Level 2: SortedMap<Bytes, Bytes>. Ordered by key; enables efficient range scans, prefix lookups, and "n newest by key" deletes.
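
To make the shape concrete, here is a minimal in-memory sketch of the two levels. It is an illustration only, not Netflix's implementation: the class name TwoLevelMap and its methods are hypothetical, and a sorted list maintained with bisect stands in for a real sorted map.

```python
import bisect
from collections import defaultdict


class TwoLevelMap:
    """Hypothetical in-memory sketch: id -> sorted map of byte keys to byte values."""

    def __init__(self):
        # Level 1: hashed lookup by string id (plain dicts here).
        self._keys = defaultdict(list)    # id -> sorted list of byte keys
        self._values = defaultdict(dict)  # id -> {key: value}

    def put(self, id_, key, value):
        if key not in self._values[id_]:
            bisect.insort(self._keys[id_], key)  # keep level 2 ordered by key
        self._values[id_][key] = value

    def get(self, id_, key):
        return self._values.get(id_, {}).get(key)

    def scan(self, id_, start=b"", end=None):
        """Range scan: (key, value) pairs with start <= key < end, in key order."""
        keys = self._keys.get(id_, [])
        lo = bisect.bisect_left(keys, start)
        hi = len(keys) if end is None else bisect.bisect_left(keys, end)
        return [(k, self._values[id_][k]) for k in keys[lo:hi]]

    def scan_prefix(self, id_, prefix):
        """Prefix lookup: every key that starts with `prefix`, in key order."""
        keys = self._keys.get(id_, [])
        out = []
        for k in keys[bisect.bisect_left(keys, prefix):]:
            if not k.startswith(prefix):
                break  # sorted order means no later key can match
            out.append((k, self._values[id_][k]))
        return out
```

Because the second level is ordered, both scan and scan_prefix locate their starting point with a binary search and then read keys contiguously instead of examining every item under the id.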

The canonical Netflix statement:

"At its core, the KV abstraction is built around a two-level map architecture. The first level is a hashed string ID (the primary key), and the second level is a sorted map of a key-value pair of bytes." (Source: sources/2024-09-19-netflix-netflixs-key-value-data-abstraction-layer)

Why this one shape covers many use cases

  • Flat KV map: id → {"" → value} (one item, empty key).
  • Named Set: id → {key → ""} (only keys matter).
  • Structured Record: fields encoded as keys, values as the field payloads.
  • Time-ordered Event log: timestamp-prefixed keys in the sorted map; range scans = time-range queries.
  • Graphs / adjacency lists: node-id as id, neighbor-ids as keys.

All of these get the same partitioning / replication / pagination / compression / chunking semantics under one abstraction.
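
The encodings above can be sketched as small helpers that each produce the inner map for one id. All function names are hypothetical, and the 8-byte big-endian timestamp prefix is an assumption chosen so that byte order matches time order for non-negative timestamps; the source does not specify a key layout.

```python
import struct


def flat_kv(value):
    # Flat KV map: a single item stored under the empty key.
    return {b"": value}


def named_set(members):
    # Named set: only the keys carry information; values are empty.
    return {member: b"" for member in members}


def event_key(ts_millis, event_id):
    # Fixed-width big-endian prefix, so sorting bytes sorts by timestamp.
    return struct.pack(">Q", ts_millis) + event_id


def time_range(items, start_ms, end_ms):
    """Events with start_ms <= timestamp < end_ms, in time order.

    sorted() stands in for the ordering a real sorted map already provides.
    """
    lo = struct.pack(">Q", start_ms)
    hi = struct.pack(">Q", end_ms)
    return [(k, v) for k, v in sorted(items.items()) if lo <= k < hi]
```

A time-range query then reduces to a key-range comparison on the prefix, which is the same operation as any other range scan over the second level.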

Cassandra mapping

The two-level map corresponds directly to Cassandra's native model:

  • id → partition key.
  • key → clustering column (ordered).
  • value → blob.

Direct DDL from the Netflix post:

CREATE TABLE IF NOT EXISTS <ns>.<table> (
  id             text,
  key            blob,
  value          blob,
  value_metadata blob,
  PRIMARY KEY (id, key))
WITH CLUSTERING ORDER BY (key <ASC|DESC>)
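
Given that table shape, a range scan over the second level becomes a slice of one partition along the clustering column. The helper below is a hypothetical sketch that only builds the CQL text with `?` bind markers; it is not taken from the Netflix post and does not talk to a cluster. The keyspace and table names are interpolated directly, so they are assumed to come from trusted configuration.

```python
def range_scan_cql(keyspace, table, lower=True, upper=True, limit=None):
    """Build a single-partition range query: id fixed, key bounded."""
    clauses = ["id = ?"]           # level 1: pick the partition
    if lower:
        clauses.append("key >= ?")  # level 2: inclusive lower bound
    if upper:
        clauses.append("key < ?")   # level 2: exclusive upper bound
    cql = f"SELECT key, value FROM {keyspace}.{table} WHERE " + " AND ".join(clauses)
    if limit is not None:
        cql += f" LIMIT {int(limit)}"
    return cql
```

Since id is the partition key and key is the clustering column, every such query touches exactly one partition and reads rows in clustering order.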

Netflix's KV DAL leverages this same shape on other backends (EVCache, DynamoDB, RocksDB) so namespace configs can swap engines without callers changing their data model.

Trade-offs

  • Wide-partition risk: because everything under one id lives on one replica set, an unbounded number of items per id produces a Cassandra "wide partition" that overloads the nodes holding it. KV DAL chunks large single items and treats wide partitions as a known anti-pattern.
  • Fat-column risk: a single huge value under one key can exhaust memory on reads. The same chunking mechanism applies.
  • Not a relational model: joins, transactions across ids, and secondary indexes are out of scope; the KV DAL explicitly trades these for simpler scaling and cache-friendliness.
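
The idea behind chunking can be sketched as splitting one large value into bounded pieces keyed by their sequence number, so no single read has to materialize the whole value at once. The layout below (4-byte big-endian chunk index as the key) and the function names are assumptions for illustration; the source does not describe KV DAL's actual chunk format.

```python
import struct


def to_chunks(value, chunk_size):
    """Split `value` into pieces of at most `chunk_size` bytes.

    Keys are big-endian chunk indices, so sorted key order is chunk order.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        struct.pack(">I", index): value[offset:offset + chunk_size]
        for index, offset in enumerate(range(0, len(value), chunk_size))
    }


def from_chunks(chunks):
    """Reassemble the original value by reading chunks in key order."""
    return b"".join(chunks[key] for key in sorted(chunks))
```

Each chunk is itself an ordinary key-value item, so it fits the same two-level shape and can be read back with a range scan.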

