SYSTEM Cited by 1 source

Netflix Casspactor

Casspactor was Netflix's legacy Cassandra-to-Iceberg data-movement engine, processing ~1,200 data movements per day and transferring approximately 3 PB of data from Apache Cassandra into Apache Iceberg tables (Source: sources/2026-06-19-netflix-the-evolution-of-cassandra-data-movement-at-netflix).

Architecture

Casspactor assembled a composite view of backups from multiple independent systems, each with its own failure modes, update cadences, and accuracy guarantees. It required every node in a region to take its snapshot at the same clock second, so a single node replacement could break data movement for the entire region.
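The same-second requirement can be illustrated with a minimal sketch. This is not Casspactor's code: the function name `region_snapshot_second`, the node identifiers, and the input shapes are hypothetical, chosen only to show why one replaced node is enough to invalidate a whole region.

```python
from datetime import datetime, timedelta, timezone


def region_snapshot_second(expected_nodes, snapshots):
    """Return the snapshot second shared by all nodes, or None.

    expected_nodes: set of node ids currently in the region (hypothetical).
    snapshots: dict mapping node id -> datetime of that node's snapshot.
    """
    # A node with no snapshot at all breaks the region.
    if expected_nodes - set(snapshots):
        return None
    # Truncate to whole seconds; every node must land on the same one.
    seconds = {snapshots[n].replace(microsecond=0) for n in expected_nodes}
    if len(seconds) != 1:
        return None
    return seconds.pop()


t = datetime(2026, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# All three nodes snapshot within the same second: the region is usable.
healthy = {"a": t, "b": t.replace(microsecond=500_000), "c": t}
ok = region_snapshot_second({"a", "b", "c"}, healthy)

# Node "c" was replaced by "d", which snapshotted one second later:
# the whole region is rejected even though "a" and "b" are fine.
replaced = {"a": t, "b": t, "d": t + timedelta(seconds=1)}
broken = region_snapshot_second({"a", "b", "d"}, replaced)
```

In this sketch `ok` is the shared timestamp and `broken` is `None`. The all-or-nothing check is what makes the constraint fragile: validity is a property of the whole region, so any single node drifting or being replaced invalidates the data movement for every node.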

Limitations that drove replacement

Fragile metadata dependencies: metadata fell out of sync with the actual backups, silently producing stale or incorrect reads.
Skewed partition failures: it could not handle tables with large partitions (common in KV and Time Series workloads) and crashed with out-of-memory (OOM) errors.
No data-model awareness: it moved raw Cassandra tables as-is, so higher-level abstractions had to be bolted on in post-processing.
Intermediate table bloat: it wrote to an intermediate Iceberg table, and higher-level connectors added further intermediate tables, compounding storage cost.
No time travel: it could not restore prior backups after topology or schema changes.
Monolithic design: it was built as a single connector, not as an engine for a family of connectors.

Replacement

Casspactor was replaced by the Cassandra Analytics Wrapper + Move Data connector architecture via the Decider Pattern implemented in Maestro. See sources/2026-06-19-netflix-the-evolution-of-cassandra-data-movement-at-netflix for the full migration story.

Seen in

sources/2026-06-19-netflix-the-evolution-of-cassandra-data-movement-at-netflix — full history, architecture, limitations, and replacement.