SYSTEM Cited by 1 source

Debezium Engine

What it is

Debezium Engine is the embedded-library mode of Debezium — the same CDC source connectors, but loaded directly into a host JVM application rather than run under Kafka Connect. The application registers a callback that Debezium Engine invokes for each change event; the application owns the output sink (Kafka, a custom HTTP target, a Lambda-invocation bridge, SQS, direct writes to another database, …).

Contrast:

Debezium (Kafka Connect mode): the default distribution; connectors run inside Kafka Connect workers, events flow to Kafka topics, and schemas are typically registered with a schema registry.
Debezium Engine: lighter-weight; no Kafka Connect cluster and no implicit Kafka destination; the host app routes events.
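As a rough illustration of the embedded mode, the sketch below builds the `java.util.Properties` a host application might hand to the engine for a Postgres source. The property keys are Debezium's documented names as of the 2.x line (older releases used `database.server.name` where 2.x uses `topic.prefix`); every value, including the stream name, host, credentials, slot name, and table, is a placeholder assumption rather than anything taken from Zalando's setup. The engine wiring itself is shown only as a comment because it needs the `debezium-api` and `debezium-embedded` artifacts on the classpath.

```java
import java.util.Properties;

final class EngineConfigSketch {

    /** Builds engine properties for a Postgres source. All values are placeholders. */
    static Properties postgresEngineProps() {
        Properties p = new Properties();
        p.setProperty("name", "orders-stream");
        p.setProperty("connector.class", "io.debezium.connector.postgresql.PostgresConnector");

        // Offsets: the host app picks the store; here, a local file.
        p.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        p.setProperty("offset.storage.file.filename", "/tmp/orders-stream-offsets.dat");
        p.setProperty("offset.flush.interval.ms", "10000");

        // Source database (placeholder connection details).
        p.setProperty("database.hostname", "localhost");
        p.setProperty("database.port", "5432");
        p.setProperty("database.user", "debezium");
        p.setProperty("database.password", "change-me");
        p.setProperty("database.dbname", "orders");

        // Logical replication settings.
        p.setProperty("topic.prefix", "orders");
        p.setProperty("plugin.name", "pgoutput");
        p.setProperty("slot.name", "orders_stream");
        p.setProperty("table.include.list", "public.orders");
        return p;
    }

    // With the Debezium artifacts available, the wiring would look roughly like:
    //
    //   DebeziumEngine<ChangeEvent<String, String>> engine =
    //       DebeziumEngine.create(Json.class)
    //           .using(postgresEngineProps())
    //           .notifying(event -> System.out.println(event.value()))
    //           .build();
    //   Executors.newSingleThreadExecutor().execute(engine);
}
```

The callback passed to `notifying` is where the host application takes over: whatever it does with the event (publish, invoke, write) is the application's own sink logic.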

Why it shows up on this wiki

Debezium Engine is Zalando's chosen deployment shape for its low-code Postgres-sourced event streams platform. Each per-stream micro-application embeds Debezium Engine, subscribes to the upstream Postgres table via pgjdbc's logical-replication API, and forwards events to the stream's configured downstream (Kafka, custom endpoints, AWS Lambda-based transformations). At publication time (2023-11), Zalando was running "hundreds of these Postgres-sourced event streams out in the wild" (Source: sources/2023-11-08-zalando-patching-the-postgresql-jdbc-driver).

Because Debezium Engine reuses the same Debezium Postgres connector code, it inherits the same transitive dependency on pgjdbc, and therefore the same KeepAlive / WAL-advancement behaviour — meaning Zalando's 2023 upstream pgjdbc fix flows through Debezium Engine deployments automatically once they pick up pgjdbc 42.7.0+.
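When auditing a fleet of Engine apps, the "42.7.0 or later" condition can be checked mechanically. The helper below is a hypothetical sketch, not part of pgjdbc or Debezium: it assumes the resolved pgjdbc version string has already been obtained from somewhere (for example a dependency report) and only compares it against the 42.7.0 threshold named above.

```java
final class PgjdbcVersionCheckSketch {

    // Threshold from the text above: the fix ships in pgjdbc 42.7.0 and later.
    private static final int[] MINIMUM = {42, 7, 0};

    /** Returns true if the given version string is at least 42.7.0. */
    static boolean isAtLeastFixedVersion(String version) {
        String[] parts = version.split("\\.");
        for (int i = 0; i < MINIMUM.length; i++) {
            // Missing components count as 0, so "43" is read as 43.0.0.
            int part = i < parts.length ? leadingInt(parts[i]) : 0;
            if (part != MINIMUM[i]) {
                return part > MINIMUM[i];
            }
        }
        return true; // exactly 42.7.0
    }

    /** Parses the leading digits of a component; returns 0 if there are none. */
    private static int leadingInt(String s) {
        int end = 0;
        while (end < s.length() && Character.isDigit(s.charAt(end))) {
            end++;
        }
        return end == 0 ? 0 : Integer.parseInt(s.substring(0, end));
    }
}
```

Only the first three numeric components are compared, so suffixed versions such as `42.2.27.jre7` are judged on `42.2.27`.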

When to choose Debezium Engine over Kafka-Connect Debezium

From the Zalando post and the Redpanda 2026-04-09 Oracle CDC framing (sources/2026-04-09-redpanda-oracle-cdc-now-available-in-redpanda-connect):

You don't already run Kafka Connect. The Connect cluster (workers, offset topics, JVM sizing, monitoring) is substantial standalone infrastructure; if your platform is not Connect-based, Debezium Engine lets you skip it.
You need custom routing / custom sinks. Connect's "always lands in a Kafka topic" shape can be inconvenient if the downstream is heterogeneous (Lambda fan-out, SQS, in-memory aggregator). Debezium Engine hands each event to your callback with no Kafka hop required.
You want lightweight per-stream isolation. One Connect worker hosts many connectors; one Debezium Engine instance hosts one connector. If each CDC stream at your company is owned by a separate micro-application (Zalando's platform shape), Engine is the natural fit.
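To make the custom-routing point concrete, here is a minimal sketch of a dispatcher a host application could call from its engine callback. The `Event` class is a hypothetical stand-in with a destination and a payload (Debezium's own `ChangeEvent` exposes comparable `destination()` and `value()` accessors); the sink names and the fallback behaviour are assumptions made for illustration, not a description of Zalando's implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

final class EventRouterSketch {

    /** Hypothetical stand-in for a change event: logical destination plus serialized payload. */
    static final class Event {
        final String destination;
        final String payload;

        Event(String destination, String payload) {
            this.destination = destination;
            this.payload = payload;
        }
    }

    private final Map<String, Consumer<Event>> sinks = new HashMap<>();
    private final Consumer<Event> fallback;

    /** The fallback receives events whose destination has no registered sink. */
    EventRouterSketch(Consumer<Event> fallback) {
        this.fallback = fallback;
    }

    /** Registers a sink (Kafka producer, HTTP client, queue writer, ...) for one destination. */
    EventRouterSketch route(String destination, Consumer<Event> sink) {
        sinks.put(destination, sink);
        return this;
    }

    /** Dispatches one event to its sink; intended to be called from the engine callback. */
    void handle(Event event) {
        sinks.getOrDefault(event.destination, fallback).accept(event);
    }
}
```

Because the router is plain application code, adding a new downstream is a matter of registering another `Consumer`, with no Kafka topic required in between.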

Costs vs Kafka Connect Debezium

No offset durability by default. Kafka Connect persists connector offsets in a compacted Kafka topic; with Debezium Engine the host application must configure and operate the offset store itself (for example a local file or a database table).
Each app owns its own JVM + dependency tree. This is why the pgjdbc-upgrade dance in Zalando's 2023 post was necessary — a fleet of Engine apps each carries its own transitive pgjdbc, so rolling out a pgjdbc fix is a per-app Docker image rebuild.
No built-in schema registry. Host app owns serialization.
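The offset-durability cost is easiest to see in code. The sketch below is a hypothetical, simplified file-backed store, not Debezium's own offset storage class: it assumes the position can be represented as a single `long` and shows the write-then-rename pattern a host application might use so that a crash does not leave a half-written offset file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.OptionalLong;

final class FileOffsetStoreSketch {

    private final Path file;

    FileOffsetStoreSketch(Path file) {
        this.file = file;
    }

    /** Persists the last processed position; call only after the sink has accepted the event. */
    void save(long position) {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try {
            Files.write(tmp, Long.toString(position).getBytes(StandardCharsets.UTF_8));
            // Rename the temp file over the old one so readers see the old or the new value.
            Files.move(tmp, file,
                    StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the stored position, or empty if nothing has been saved yet. */
    OptionalLong load() {
        if (!Files.exists(file)) {
            return OptionalLong.empty();
        }
        try {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            return OptionalLong.of(Long.parseLong(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Saving only after the downstream has accepted an event gives at-least-once delivery: after a crash the app may replay events past the stored position, so sinks should tolerate duplicates.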

Seen in

sources/2023-11-08-zalando-patching-the-postgresql-jdbc-driver — canonical wiki introduction of Debezium Engine. Zalando's event-streaming platform runs "a micro application, powered by Debezium Engine" per declared event stream; the post is framed around the pgjdbc fix (PR #2941) that flows through Debezium Engine's transitive pgjdbc dependency to every Engine-based CDC stream. Canonical wiki framing: Debezium Engine as the embedded-mode sibling of Kafka-Connect Debezium, used where the host application wants direct control over the event-processing pipeline.

systems/debezium — the parent project; Debezium Engine is one of its two deployment modes.
systems/kafka-connect — the alternative hosting framework; Debezium Engine is designed for the case where you don't run Connect.
systems/postgresql — the source DB in Zalando's Debezium-Engine deployments.
systems/pgjdbc-postgres-jdbc-driver — the transitive driver dependency; pgjdbc behaviour is load-bearing for the Engine's Postgres source.
systems/zalando-postgres-event-streams — the canonical Debezium Engine deployment shape on this wiki.
concepts/logical-replication — what Debezium Engine's Postgres connector consumes.
concepts/change-data-capture — the category.