PATTERN
Debezium + Kafka Connect CDC pipeline¶
Summary¶
The canonical open-source shape for change-data-capture at scale:
Source DB (Postgres / MySQL / MongoDB / Cassandra / ...)
│ replication log (WAL / binlog / oplog / commit log)
▼
Debezium source connector (runs on Kafka Connect)
│ Avro-serialised keyed records + schema updates
▼
Kafka topic + Kafka Schema Registry (compat gate)
│
▼
Kafka Connect sink connector
│ + Single-Message Transforms (per-tenant shape tweaks)
▼
Destination (Elasticsearch / Postgres / Iceberg / Cassandra /
another Kafka / ...)
This is the substrate pattern under most CDC platforms discussed in the wiki, and is the transport backbone of Datadog's managed multi-tenant replication platform.
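To make the pipeline's shape concrete, here is a sketch of a Debezium Postgres source-connector config, expressed as the JSON body you would register with the Kafka Connect REST API. The property names are standard Debezium/Connect settings; the hostnames, credentials, and connector/slot/publication names are placeholders, not values from the source article.

```python
# Sketch of a Debezium Postgres source-connector registration payload.
# All names and endpoints below are hypothetical placeholders.
import json

connector = {
    "name": "orders-cdc",  # hypothetical connector name
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "pg-primary.internal",
        "database.port": "5432",
        "database.user": "debezium",          # user with replication permissions
        "database.password": "********",
        "database.dbname": "orders",
        "plugin.name": "pgoutput",            # Postgres built-in logical-decoding plugin
        "slot.name": "orders_slot",           # per-consumer progress cursor
        "publication.name": "orders_pub",     # declares what to replicate
        "topic.prefix": "orders",             # Kafka topic namespace
        # Avro + Schema Registry: the compat gate described above
        "key.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter.schema.registry.url": "http://schema-registry:8081",
        # Heartbeats keep the slot's LSN advancing during quiet periods
        "heartbeat.interval.ms": "10000",
    },
}

print(json.dumps(connector, indent=2))
```

In practice this payload is POSTed to the Connect cluster's `/connectors` endpoint; Connect then schedules the connector task and tracks its offsets.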
Why this shape¶
Each component earns its place:
- Debezium — handles the database-specific mechanics of tailing the replication log and emitting row-level change events (insert / update / delete with primary-key and column values). Users don't hand-roll a Postgres logical-decoding consumer or a Cassandra commit-log reader.
- Kafka — provides the durable, ordered, replayable, partition-scalable middle that lets source producers and sink consumers progress at independent rates. Kafka's partition model carries CDC keys natively.
- Kafka Connect — hosts both Debezium (source) and the sink connectors (Elasticsearch, JDBC, Iceberg, …) under a uniform lifecycle + offset-tracking + rebalancing framework. Exposes single-message transforms as the per-pipeline customisation point.
- Kafka Schema Registry — the runtime gate against schema-incompatible updates. Debezium serialises records and schema changes to Avro and publishes to the registry; the registry enforces a compatibility mode (backward, typically) across source + sink.
The composition makes each half — producer side and consumer side — independently operable, upgradable, and scalable.
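On the consumer side, sink logic reduces to applying Debezium's change-event envelope. The envelope fields (`op`, `before`, `after`) follow Debezium's documented event format; the apply function and the in-memory "destination" below are an illustrative sketch, not a real sink connector.

```python
# Minimal sketch of how a sink applies Debezium change events.
# dest stands in for the destination store, keyed by primary key.

def apply_event(dest: dict, key: int, event: dict) -> None:
    op = event["op"]  # "c"=create, "u"=update, "d"=delete, "r"=snapshot read
    if op in ("c", "u", "r"):
        dest[key] = event["after"]   # upsert the new row image
    elif op == "d":
        dest.pop(key, None)          # delete: remove the row

dest = {}
apply_event(dest, 1, {"op": "c", "before": None, "after": {"id": 1, "status": "new"}})
apply_event(dest, 1, {"op": "u", "before": {"id": 1, "status": "new"},
                      "after": {"id": 1, "status": "paid"}})
apply_event(dest, 1, {"op": "d", "before": {"id": 1, "status": "paid"}, "after": None})
print(dest)  # → {}
```

Because each event carries the full new row image, replaying a topic from the beginning reconverges the destination — the property that makes the middle replayable.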
Operational prerequisites (Postgres side)¶
The 7-step runbook Datadog outlined for a Postgres source pipeline:
- Enable logical replication (wal_level=logical).
- Create + configure Postgres users with replication permissions.
- Establish replication publications (what to replicate) and slots (per-consumer progress cursors).
- Deploy Debezium source connectors on Kafka Connect mapped to those slots.
- Create Kafka topics with correct partitioning; map Debezium instances to topics.
- Add heartbeat tables so slots advance LSN during quiet periods (otherwise WAL pins indefinitely and the primary's disk fills).
- Configure sink connectors into the downstream system.
Each step is a standalone failure mode if skipped. Datadog's Temporal-orchestrated automation wraps this runbook as the provisioning-layer pattern.
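The Postgres-side steps of the runbook can be sketched as the SQL an operator (or provisioning automation) would run. The statements use standard Postgres DDL and catalog functions; the role, table, publication, and slot names are placeholders.

```python
# The Postgres-side prep steps, sketched as ordered SQL statements.
# All object names are hypothetical; run via psql or a migration tool.

POSTGRES_PREP = [
    # 1. Enable logical replication (requires a server restart).
    "ALTER SYSTEM SET wal_level = 'logical';",
    # 2. A user with replication permissions for Debezium to connect as.
    "CREATE ROLE debezium WITH LOGIN REPLICATION PASSWORD 'changeme';",
    # 3a. Publication: declares *what* to replicate.
    "CREATE PUBLICATION orders_pub FOR TABLE public.orders;",
    # 3b. Slot: the per-consumer progress cursor, tied to the pgoutput plugin.
    "SELECT pg_create_logical_replication_slot('orders_slot', 'pgoutput');",
    # 6. Heartbeat table so the slot's LSN advances during quiet periods.
    "CREATE TABLE debezium_heartbeat (id int PRIMARY KEY, ts timestamptz);",
]

for stmt in POSTGRES_PREP:
    print(stmt)
```

Steps 4, 5, and 7 (deploying the connector, creating topics, configuring sinks) happen on the Kafka Connect side rather than in Postgres, which is why they don't appear here.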
What this pattern pairs with¶
- patterns/schema-registry-backward-compat — the runtime schema-compat rule the registry enforces.
- patterns/schema-validation-before-deploy — the offline gate that rejects migration SQL that would break in-flight records.
- patterns/connector-transformations-plus-enrichment-api — the per-tenant customisation shape applied on top of the Kafka Connect sink.
- patterns/workflow-orchestrated-pipeline-provisioning — automation around the 7-step runbook.
- patterns/managed-replication-platform — the full platform shape Datadog built on top of this backbone.
Failure modes the pattern explicitly admits¶
- Pinned WAL: slow or disconnected consumers pin the Postgres replication slot and prevent WAL recycling; heartbeat tables and aggressive slot-monitoring are the standard fix.
- Schema drift without registry protection: a source-side DDL that changes record shape in a compat-breaking way propagates to downstream consumers. Running the registry in backward-compat mode (patterns/schema-registry-backward-compat) is the canonical answer.
- Connector-specific transformation limits: where the OSS single-message transforms don't cover a tenant's need, the operator maintains custom forks (Datadog does this for Datadog-specific logic and optimisations).
- Async-only consistency: this pipeline is, by construction, asynchronous. Any workload requiring same-transaction source+destination visibility needs a different pattern.
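The slot-monitoring half of the pinned-WAL fix can be sketched as follows: measure how far each slot's restart_lsn trails the current WAL position, and alert past a threshold. The query uses the standard `pg_replication_slots` catalog and `pg_wal_lsn_diff`; the threshold value and the alerting function are illustrative assumptions.

```python
# Sketch of aggressive slot monitoring for the "pinned WAL" failure mode.
# The SQL is standard Postgres; the threshold and rows below are hypothetical.

SLOT_LAG_SQL = """
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
WHERE restart_lsn IS NOT NULL;
"""

MAX_RETAINED_BYTES = 8 * 1024**3  # e.g. alert past 8 GiB of pinned WAL

def slots_pinning_wal(rows: list) -> list:
    """Return slot names whose retained WAL exceeds the threshold."""
    return [name for name, retained in rows if retained > MAX_RETAINED_BYTES]

# Hypothetical query results: one healthy slot, one slot pinning WAL.
rows = [("orders_slot", 12_000_000), ("stale_slot", 20 * 1024**3)]
print(slots_pinning_wal(rows))  # → ['stale_slot']
```

A slot that stays on this alert list is either a slow consumer (scale the sink) or an abandoned one (drop the slot) — both are operator decisions the monitoring only surfaces.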
Seen in¶
- sources/2025-11-04-datadog-replication-redefined-multi-tenant-cdc-platform — core transport backbone of Datadog's managed multi-tenant CDC platform. Postgres-to-Elasticsearch seed pipeline ran with ~500 ms replication lag and delivered page-load latency improvements up to 97%. The same backbone generalised across Postgres-to-Postgres, Postgres-to-Iceberg, Cassandra-to-X, and cross-region Kafka replication.
Related¶
- concepts/change-data-capture — the concept this pattern realises.
- concepts/logical-replication — Postgres-specific source technology.
- concepts/asynchronous-replication — the consistency posture.
- systems/debezium, systems/kafka, systems/kafka-connect, systems/kafka-schema-registry, systems/postgresql — component systems.
- patterns/managed-replication-platform — how Datadog wraps this backbone into a tenant-facing product.