Debezium Engine¶
What it is¶
Debezium Engine is the embedded-library mode of Debezium — the same CDC source connectors, but loaded directly into a host JVM application rather than run under Kafka Connect. The application registers a callback that Debezium Engine invokes for each change event; the application owns the output sink (Kafka, a custom HTTP target, a Lambda-invocation bridge, SQS, direct writes to another database, …).
Contrast:
- Debezium (Kafka Connect mode): the default distribution; connectors run inside Kafka Connect workers, events flow to Kafka topics, and schemas can be registered with a schema registry when a registry-aware converter (for example Avro) is configured.
- Debezium Engine: lighter-weight; no Kafka Connect cluster, no implicit Kafka destination; the host app routes events.
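As a rough illustration of the embedded mode, the engine is driven by a plain set of properties built inside the host application rather than a connector config submitted to a Connect cluster. The sketch below assembles such a set with `java.util.Properties` only. The property keys are standard Debezium / Kafka Connect settings; the class name `EngineConfigSketch`, the host, credentials, database, table, slot name, and offset-file path are placeholders invented for illustration. Handing the properties to the engine needs the Debezium dependencies on the classpath, so that step appears only as a comment.

```java
import java.util.Properties;

// Hypothetical helper: assembles the configuration an embedded Debezium
// Postgres connector could be started with. All values are placeholders.
class EngineConfigSketch {

    static Properties postgresEngineConfig(String streamName, String table, String offsetFile) {
        Properties props = new Properties();
        // Engine-level settings.
        props.setProperty("name", streamName);
        props.setProperty("connector.class", "io.debezium.connector.postgresql.PostgresConnector");
        // Offsets are kept by the host app; here, in a local file.
        props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", offsetFile);
        props.setProperty("offset.flush.interval.ms", "10000");
        // Connector-level settings (placeholder connection details).
        props.setProperty("database.hostname", "localhost");
        props.setProperty("database.port", "5432");
        props.setProperty("database.user", "cdc_user");
        props.setProperty("database.password", "change-me");
        props.setProperty("database.dbname", "shop");
        props.setProperty("topic.prefix", streamName);
        props.setProperty("plugin.name", "pgoutput");
        props.setProperty("slot.name", "slot_" + streamName);
        props.setProperty("table.include.list", table);
        return props;
    }

    public static void main(String[] args) {
        Properties props = postgresEngineConfig(
                "orders_stream", "public.orders", "/tmp/orders-offsets.dat");
        System.out.println("connector: " + props.getProperty("connector.class"));
        System.out.println("offsets:   " + props.getProperty("offset.storage.file.filename"));
        // With the Debezium API and embedded-engine dependencies available, the
        // properties would be passed to the engine roughly like this (not compiled here):
        //   DebeziumEngine<ChangeEvent<String, String>> engine =
        //       DebeziumEngine.create(Json.class)
        //           .using(props)
        //           .notifying(event -> System.out.println(event.value()))
        //           .build();
        //   Executors.newSingleThreadExecutor().execute(engine);
    }
}
```

The callback passed to `notifying` is where the host application decides what happens to each change event; nothing in the configuration names a Kafka destination.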
Why it shows up on this wiki¶
Debezium Engine is Zalando's chosen deployment shape for its low-code Postgres-sourced event streams platform. Each per-stream micro-application embeds Debezium Engine, subscribes to the upstream Postgres table via pgjdbc's logical-replication API, and forwards events to the stream's configured downstream (Kafka, custom endpoints, AWS Lambda-based transformations). At publication time (2023-11), Zalando was running "hundreds of these Postgres-sourced event streams out in the wild" (Source: sources/2023-11-08-zalando-patching-the-postgresql-jdbc-driver).
Because Debezium Engine reuses the same Debezium Postgres connector code, it inherits the same transitive dependency on pgjdbc, and with it the same KeepAlive / WAL-advancement behaviour. Zalando's 2023 upstream pgjdbc fix therefore reaches Debezium Engine deployments without any connector code change, once each application is rebuilt against pgjdbc 42.7.0+.
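Since the fix reaches an Engine app only through the pgjdbc version it actually resolves, a startup check can make that dependency visible. A minimal sketch, assuming the driver is on the classpath and registered with `java.sql.DriverManager`, and assuming pgjdbc reports its release line through the standard JDBC major/minor version methods (42 and 7 for 42.7.x). `PgjdbcVersionCheck` and its method names are hypothetical.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.util.Collections;

// Hypothetical startup check: is the registered pgjdbc at or above 42.7?
class PgjdbcVersionCheck {

    // True when (major, minor) is at or above (minMajor, minMinor).
    static boolean isAtLeast(int major, int minor, int minMajor, int minMinor) {
        return major > minMajor || (major == minMajor && minor >= minMinor);
    }

    // Returns the first registered driver whose class lives under
    // org.postgresql, or null when no such driver is registered.
    static Driver findPgjdbc() {
        for (Driver d : Collections.list(DriverManager.getDrivers())) {
            if (d.getClass().getName().startsWith("org.postgresql.")) {
                return d;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Driver pg = findPgjdbc();
        if (pg == null) {
            System.out.println("pgjdbc is not registered with DriverManager");
            return;
        }
        boolean fixed = isAtLeast(pg.getMajorVersion(), pg.getMinorVersion(), 42, 7);
        System.out.println("pgjdbc " + pg.getMajorVersion() + "." + pg.getMinorVersion()
                + (fixed ? " is at or above 42.7" : " predates 42.7"));
    }
}
```

A check like this only reports what was resolved; the version itself is still set by the application's build.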
When to choose Debezium Engine over Kafka-Connect Debezium¶
From the Zalando post and the Redpanda 2026-04-09 Oracle CDC framing (sources/2026-04-09-redpanda-oracle-cdc-now-available-in-redpanda-connect):
- You don't already run Kafka Connect. The Connect cluster (workers, offset topics, JVM sizing, monitoring) is substantial standalone infrastructure; if your platform is not Connect-based, Debezium Engine lets you skip it.
- You need custom routing / custom sinks. Connect's "always lands in a Kafka topic" shape can be inconvenient if the downstream is heterogeneous (Lambda fan-out, SQS, in-memory aggregator). Debezium Engine hands each event to your callback with no Kafka hop required.
- You want lightweight per-stream isolation. One Connect worker hosts many connectors; one Debezium Engine instance hosts one connector. If each CDC stream at your company is owned by a separate micro-application (Zalando's platform shape), Engine is the natural fit.
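The custom-routing point can be sketched without the Debezium dependency: the callback receives each event and decides where it goes. Everything below is a hypothetical illustration. `Change` stands in for Debezium's change-event type (which exposes a key, a value, and a destination), and `EventRouter` with its in-memory sinks replaces real Kafka, SQS, or Lambda clients.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a change event: destination, key, serialized value.
final class Change {
    final String destination;
    final String key;
    final String value;

    Change(String destination, String key, String value) {
        this.destination = destination;
        this.key = key;
        this.value = value;
    }
}

// Sends each event to the sink registered for its destination and
// falls back to a default sink when no route matches.
final class EventRouter implements Consumer<Change> {
    private final Map<String, Consumer<Change>> sinks = new HashMap<>();
    private final Consumer<Change> fallback;

    EventRouter(Consumer<Change> fallback) {
        this.fallback = fallback;
    }

    EventRouter route(String destination, Consumer<Change> sink) {
        sinks.put(destination, sink);
        return this;
    }

    @Override
    public void accept(Change event) {
        sinks.getOrDefault(event.destination, fallback).accept(event);
    }
}

class RoutingSketch {
    public static void main(String[] args) {
        List<String> queue = new ArrayList<>();     // stands in for a queue client
        List<String> unrouted = new ArrayList<>();  // stands in for a dead-letter sink
        EventRouter router = new EventRouter(e -> unrouted.add(e.key))
                .route("shop.public.orders", e -> queue.add(e.value));

        router.accept(new Change("shop.public.orders", "1", "{\"id\":1}"));
        router.accept(new Change("shop.public.unknown", "2", "{}"));

        // prints "1 queued, 1 unrouted"
        System.out.println(queue.size() + " queued, " + unrouted.size() + " unrouted");
    }
}
```

In a real Engine app the router would be the body of the engine's notification callback, so no event has to pass through a Kafka topic before reaching its sink.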
Costs vs Kafka Connect Debezium¶
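- Offset durability is the host application's responsibility. Kafka Connect in distributed mode persists connector offsets in a compacted Kafka topic; Debezium Engine expects the host application to configure an offset store (file-based or in a database) and to keep that storage durable across restarts.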
- Each app owns its own JVM + dependency tree. This is why the pgjdbc-upgrade dance in Zalando's 2023 post was necessary — a fleet of Engine apps each carries its own transitive pgjdbc, so rolling out a pgjdbc fix is a per-app Docker image rebuild.
- No built-in schema registry. Host app owns serialization.
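To make the offset cost concrete, here is a minimal sketch of file-based offset persistence of the kind a host application takes on. This is not Debezium's own offset store. `FileOffsets` is a hypothetical class that saves a position (here a Postgres LSN rendered as a string) by writing a temporary file and renaming it over the old one, so a crash mid-write is less likely to leave a truncated offset file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

// Hypothetical offset store: one properties file, one entry per stream.
final class FileOffsets {
    private final Path file;

    FileOffsets(Path file) {
        this.file = file;
    }

    // Writes all offsets to a sibling temp file, then renames it over the target.
    // Whether ATOMIC_MOVE replaces an existing target is implementation-specific
    // per the Files.move Javadoc; a POSIX rename does replace it.
    void save(String streamName, String position) {
        Properties all = loadAll();
        all.setProperty(streamName, position);
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                all.store(out, "connector offsets");
            }
            Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the last saved position, or null when nothing was saved yet.
    String load(String streamName) {
        return loadAll().getProperty(streamName);
    }

    private Properties loadAll() {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Path path = Paths.get(System.getProperty("java.io.tmpdir"),
                "offsets-" + System.nanoTime() + ".properties");
        FileOffsets store = new FileOffsets(path);
        store.save("orders_stream", "0/16B3748");
        System.out.println("resume from " + store.load("orders_stream"));
    }
}
```

The file still has to live on storage that survives a container restart; a local path inside an ephemeral container would lose the position and force the stream to start over or re-snapshot.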
Seen in¶
- sources/2023-11-08-zalando-patching-the-postgresql-jdbc-driver — canonical wiki introduction of Debezium Engine. Zalando's event-streaming platform runs "a micro application, powered by Debezium Engine" per declared event stream; the post is framed around the pgjdbc fix (PR #2941) that flows through Debezium Engine's transitive pgjdbc dependency to every Engine-based CDC stream. Canonical wiki framing: Debezium Engine as the embedded-mode sibling of Kafka-Connect Debezium, used where the host application wants direct control over the event-processing pipeline.
Related¶
- systems/debezium — the parent project; Debezium Engine is one of its two deployment modes.
- systems/kafka-connect — the alternative hosting framework; Debezium Engine is designed for the case where you don't run Connect.
- systems/postgresql — the source DB in Zalando's Debezium-Engine deployments.
- systems/pgjdbc-postgres-jdbc-driver — the transitive driver dependency; pgjdbc behaviour is load-bearing for the Engine's Postgres source.
- systems/zalando-postgres-event-streams — the canonical Debezium Engine deployment shape on this wiki.
- concepts/logical-replication — what Debezium Engine's Postgres connector consumes.
- concepts/change-data-capture — the category.