CONCEPT Cited by 1 source
Spanner change stream
Definition
Spanner change streams are Google Cloud Spanner's native CDC mechanism: a named change stream is declared against a database, table set, or column set, and emits change records (insert / update / delete with before/after values) partitioned to track Spanner's underlying horizontally-sharded storage layer.
Consumers read change records from one or more partitions of the change stream; Spanner manages partition topology automatically as the underlying table's sharding evolves.
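The three declaration scopes above (database, table set, column set) can be illustrated with a small helper that builds the corresponding DDL. This is a minimal sketch, assuming Spanner's documented `CREATE CHANGE STREAM ... FOR ...` statement shape; the helper `change_stream_ddl`, the stream names, and the table and column names are hypothetical and exist only for illustration.

```python
from typing import Mapping, Optional, Sequence


def change_stream_ddl(
    name: str,
    watched: Optional[Mapping[str, Sequence[str]]] = None,
    retention_period: Optional[str] = None,
) -> str:
    """Build a CREATE CHANGE STREAM statement (illustrative helper).

    watched=None watches the whole database (FOR ALL). Otherwise it maps
    table name -> column names; in this helper an empty column list means
    "watch the whole table".
    """
    if watched is None:
        target = "ALL"
    else:
        parts = []
        for table, columns in watched.items():
            # Column-scoped tables are written as Table(col1, col2).
            parts.append(f"{table}({', '.join(columns)})" if columns else table)
        target = ", ".join(parts)

    ddl = f"CREATE CHANGE STREAM {name} FOR {target}"
    if retention_period is not None:
        # Optional operator-controlled retention window, e.g. '7d'.
        ddl += f" OPTIONS (retention_period = '{retention_period}')"
    return ddl


# Database-wide stream.
print(change_stream_ddl("EverythingStream"))
# Two columns of one table plus all of another, with a retention window.
print(change_stream_ddl(
    "NamesAndAlbums",
    {"Singers": ["FirstName", "LastName"], "Albums": []},
    retention_period="7d",
))
```

The helper only produces the statement text; applying it to a real database would go through whatever schema-update path the deployment already uses.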
Distinguishing operational properties
Three properties, quoted verbatim from Redpanda Connect's
gcp_spanner_cdc connector documentation as cited in the 2025-03-18
post:
- Transactional progress storage. "Stores progress transactionally in a configurable spanner table for at least once delivery." The consumer's change-stream offset is persisted as a row in a Spanner table, rather than in a server-side catalog object (a Postgres replication slot) or an external offset store (MySQL / MongoDB). Progress and data share the same transactional substrate, which is only viable because Spanner's transactional commit spans arbitrary rows cluster-wide.
- Dynamic partition split/merge handling. "Automatically processes partitions that are merged and split, avoiding hotspots." Spanner can split a partition (when one shard becomes hot) or merge partitions into one (when they are cold), and the CDC consumer must reconcile partition identity across these topology changes without duplicating reads or missing records. This is the structural difference from CDC in Postgres / MySQL / MongoDB, where the change log's topology is stable over time.
- Configurable retention + filters. "Custom change data retention, table modification filters and value types." Operator-controlled retention window for change records; filters on which table modifications to emit (specific columns, specific operations).
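The at-least-once behaviour implied by transactional progress storage can be sketched as follows. This is a hedged illustration, not the connector's implementation: an in-memory SQLite table stands in for the configurable Spanner progress table, and the names `progress`, `partition_token`, `watermark`, and `consume` are all assumptions made for the example. The point it shows is the ordering: deliver downstream first, then commit the new watermark, so a crash between the two steps causes a redelivery rather than a lost record.

```python
import sqlite3


def open_progress_store() -> sqlite3.Connection:
    """Create the stand-in progress table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE progress ("
        "partition_token TEXT PRIMARY KEY, watermark INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def load_watermark(conn: sqlite3.Connection, token: str) -> int:
    row = conn.execute(
        "SELECT watermark FROM progress WHERE partition_token = ?", (token,)
    ).fetchone()
    return row[0] if row else 0


def consume(conn, token, records, sink, crash_before_commit_at=None):
    """Deliver records newer than the stored watermark, then persist progress."""
    watermark = load_watermark(conn, token)
    for commit_ts, payload in records:
        if commit_ts <= watermark:
            continue  # already acknowledged in an earlier run
        sink.append(payload)  # deliver downstream first
        if commit_ts == crash_before_commit_at:
            raise RuntimeError("simulated crash before progress commit")
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT OR REPLACE INTO progress (partition_token, watermark) "
                "VALUES (?, ?)",
                (token, commit_ts),
            )


records = [(1, "a"), (2, "b"), (3, "c")]
conn = open_progress_store()
delivered = []
try:
    consume(conn, "p0", records, delivered, crash_before_commit_at=2)
except RuntimeError:
    pass  # the consumer process "restarts" here
consume(conn, "p0", records, delivered)
print(delivered)  # "b" appears twice: delivered, crashed, redelivered
```

Downstream consumers of such a pipeline would need to tolerate the duplicate, for example by deduplicating on a record key.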
Comparison with other CDC substrates
The four CDC-capable engines Redpanda Connect supports split three ways on how consumer offsets are made durable:
| Engine | Offset storage | Topology |
|---|---|---|
| PostgreSQL | Server-owned replication slot | Static |
| MySQL | External offset store (Redis / SQL) | Static |
| MongoDB | External offset store (resume token) | Static |
| Spanner | Transactional row in source Spanner | Dynamic (splits/merges) |
Of the four, Spanner is the only engine whose change-stream partition topology changes over time. This forces the consumer to handle "partition finished, its successors are X and Y" or "partitions A and B merged into C" events as first-class signals, not edge cases. The Redpanda post does not disclose how the connector coordinates this.
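Since the source does not describe the connector's coordination mechanism, the following is only a hypothetical sketch of one way a consumer could treat split and merge events as first-class signals. It assumes a simplified model in which a finished partition reports its children, and each child lists all of its parent tokens; `PartitionTracker` and its methods are invented for this example. A child is scheduled exactly once, and only after every one of its parents has finished, which covers both the split case (two children, one parent) and the merge case (one child, two parents).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class PartitionTracker:
    finished: Set[str] = field(default_factory=set)
    scheduled: Set[str] = field(default_factory=set)
    # child token -> full set of parent tokens it depends on
    pending: Dict[str, Set[str]] = field(default_factory=dict)

    def start(self, token: str) -> None:
        """Register a root partition that is read from the beginning."""
        self.scheduled.add(token)

    def on_partition_finished(
        self, token: str, children: List[Tuple[str, List[str]]]
    ) -> List[str]:
        """Record that `token` ended; return children now safe to read."""
        self.finished.add(token)
        for child, parents in children:
            # A merged child is reported by each parent; register it once.
            if child not in self.scheduled:
                self.pending.setdefault(child, set(parents))
        ready = []
        for child, parents in list(self.pending.items()):
            if parents <= self.finished:  # all parents fully drained
                ready.append(child)
                self.scheduled.add(child)
                del self.pending[child]
        return sorted(ready)


tracker = PartitionTracker()
tracker.start("A")
tracker.start("B")
# Split: A ends and hands over to A1 and A2.
print(tracker.on_partition_finished("A", [("A1", ["A"]), ("A2", ["A"])]))
# Merge: B and A2 merge into C; C must wait until both parents finish.
print(tracker.on_partition_finished("B", [("C", ["A2", "B"])]))
print(tracker.on_partition_finished("A2", [("C", ["A2", "B"])]))
```

Waiting for all parents before starting a merged child is what keeps records from being read out of order across the topology change; registering each child once is what prevents duplicate readers.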
Seen in
- sources/2025-03-18-redpanda-3-powerful-connectors-for-real-time-change-data-capture: canonical wiki introduction. Spanner change streams are named as one of four engine-native CDC substrates Redpanda Connect surfaces via the gcp_spanner_cdc connector. The three distinguishing properties (transactional progress storage, dynamic partition split/merge, configurable retention) are quoted verbatim.
Related
- concepts/change-data-capture
- systems/cloud-spanner
- systems/redpanda-connect: gcp_spanner_cdc consumer.
- patterns/snapshot-plus-catchup-replication: the two-phase pattern CDC shapes participate in.