Debezium Engine

What it is

Debezium Engine is the embedded-library mode of Debezium — the same CDC source connectors, but loaded directly into a host JVM application rather than run under Kafka Connect. The application registers a callback that Debezium Engine invokes for each change event; the application owns the output sink (Kafka, a custom HTTP target, a Lambda-invocation bridge, SQS, direct writes to another database, …).

Contrast:

  • Debezium (Kafka Connect mode) — default distribution; connectors run inside Kafka Connect workers, events flow to Kafka topics, schemas register with a Kafka Schema Registry.
  • Debezium Engine — lighter-weight; no Kafka Connect cluster, no implicit Kafka destination; the host app routes events.
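
A minimal sketch of the configuration a host application might assemble before handing it to the engine. The property keys are Debezium connector settings as understood here (`topic.prefix` applies to Debezium 2.x; older releases used a different key). Every value (host, credentials, database, slot name) is a hypothetical placeholder, and the class `EmbeddedPostgresConfig` is invented for illustration. Only `java.util.Properties` is used, so the sketch compiles without Debezium on the classpath.

```java
import java.util.Properties;

/** Illustrative helper, not part of Debezium: builds properties for one embedded CDC stream. */
class EmbeddedPostgresConfig {

    /** All connection values below are placeholders chosen for the example. */
    static Properties forTable(String streamName, String table) {
        Properties props = new Properties();
        props.setProperty("name", streamName);
        props.setProperty("connector.class", "io.debezium.connector.postgresql.PostgresConnector");
        props.setProperty("plugin.name", "pgoutput");          // Postgres logical-decoding plugin
        props.setProperty("database.hostname", "localhost");   // placeholder
        props.setProperty("database.port", "5432");            // placeholder
        props.setProperty("database.user", "cdc_user");        // placeholder
        props.setProperty("database.password", "change-me");   // placeholder
        props.setProperty("database.dbname", "orders");        // placeholder
        props.setProperty("topic.prefix", streamName);         // Debezium 2.x key (assumption)
        // Replication slot names allow lowercase letters, digits and underscores only.
        props.setProperty("slot.name", streamName.replace('-', '_') + "_slot");
        props.setProperty("table.include.list", table);
        return props;
    }

    public static void main(String[] args) {
        Properties props = forTable("orders-stream", "public.orders");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a real application these properties would be passed to the engine builder together with offset-storage settings and the event callback.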

Why it shows up on this wiki

Debezium Engine is Zalando's chosen deployment shape for its low-code Postgres-sourced event streams platform. Each per-stream micro-application embeds Debezium Engine, subscribes to the upstream Postgres table via pgjdbc's logical-replication API, and forwards events to the stream's configured downstream (Kafka, custom endpoints, AWS Lambda-based transformations). At publication time (2023-11), Zalando was running "hundreds of these Postgres-sourced event streams out in the wild" (Source: sources/2023-11-08-zalando-patching-the-postgresql-jdbc-driver).

Because Debezium Engine reuses the same Debezium Postgres connector code, it inherits the same transitive dependency on pgjdbc, and therefore the same KeepAlive / WAL-advancement behaviour — meaning Zalando's 2023 upstream pgjdbc fix flows through Debezium Engine deployments automatically once they pick up pgjdbc 42.7.0+.
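
As an illustration of the "42.7.0+" condition, a host application could gate startup on the pgjdbc version it actually resolved. The sketch below is hypothetical: the class `PgjdbcVersionGate` is invented for this page, and how the version string is obtained (build metadata, the driver itself) is left as an assumption. It only compares dotted version strings.

```java
/** Illustrative helper, not part of pgjdbc or Debezium. */
class PgjdbcVersionGate {

    /** True if version (e.g. "42.7.3") is at least major.minor.patch; missing parts count as 0. */
    static boolean isAtLeast(String version, int major, int minor, int patch) {
        int[] minimum = {major, minor, patch};
        String[] parts = version.split("\\.");
        for (int i = 0; i < minimum.length; i++) {
            int actual = i < parts.length ? leadingInt(parts[i]) : 0;
            if (actual != minimum[i]) {
                return actual > minimum[i];
            }
        }
        return true; // all three components are equal
    }

    /** Parses the leading digits of a component, so "3-SNAPSHOT" is read as 3. */
    private static int leadingInt(String s) {
        int end = 0;
        while (end < s.length() && Character.isDigit(s.charAt(end))) {
            end++;
        }
        return end == 0 ? 0 : Integer.parseInt(s.substring(0, end));
    }

    public static void main(String[] args) {
        for (String v : new String[] {"42.6.0", "42.7.0", "42.7.3"}) {
            System.out.println(v + " -> " + isAtLeast(v, 42, 7, 0));
        }
    }
}
```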

When to choose Debezium Engine over Kafka-Connect Debezium

From the Zalando post and the Redpanda 2026-04-09 Oracle CDC framing (sources/2026-04-09-redpanda-oracle-cdc-now-available-in-redpanda-connect):

  • You don't already run Kafka Connect. The Connect cluster (workers, offset topics, JVM sizing, monitoring) is substantial standalone infrastructure; if your platform is not Connect-based, Debezium Engine lets you skip it.
  • You need custom routing / custom sinks. Connect's "always lands in a Kafka topic" shape can be inconvenient if the downstream is heterogeneous (Lambda fan-out, SQS, in-memory aggregator). Debezium Engine hands each event to your callback with no Kafka hop required.
  • You want lightweight per-stream isolation. One Connect worker hosts many connectors; one Debezium Engine instance hosts one connector. If each CDC stream at your company is owned by a separate micro-application (Zalando's platform shape), Engine is the natural fit.
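
The custom-routing point can be sketched as a callback that dispatches each event to one of several sinks. This is a stand-in under stated assumptions: `EventRouter` and its nested `Event` type are hypothetical, and a plain `java.util.function.Consumer` plays the role of the callback the host application would register with the engine, so the example runs without Debezium on the classpath. The in-memory lists stand in for real sinks such as SQS or a Lambda bridge.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Routes each event to the first sink whose prefix matches the event's destination. */
final class EventRouter implements Consumer<EventRouter.Event> {

    /** Hypothetical event shape; a real engine delivers its own change-event type. */
    static final class Event {
        private final String destination;
        private final String key;
        private final String value;

        Event(String destination, String key, String value) {
            this.destination = destination;
            this.key = key;
            this.value = value;
        }

        String destination() { return destination; }
        String key() { return key; }
        String value() { return value; }
    }

    // Insertion order is preserved, so earlier routes win over later ones.
    private final Map<String, Consumer<Event>> sinks = new LinkedHashMap<>();
    private final Consumer<Event> fallback;

    EventRouter(Consumer<Event> fallback) {
        this.fallback = fallback;
    }

    EventRouter route(String destinationPrefix, Consumer<Event> sink) {
        sinks.put(destinationPrefix, sink);
        return this;
    }

    @Override
    public void accept(Event event) {
        for (Map.Entry<String, Consumer<Event>> entry : sinks.entrySet()) {
            if (event.destination().startsWith(entry.getKey())) {
                entry.getValue().accept(event);
                return;
            }
        }
        fallback.accept(event); // no route matched
    }

    public static void main(String[] args) {
        List<String> queueStandIn = new ArrayList<>();
        List<String> unrouted = new ArrayList<>();
        EventRouter router = new EventRouter(e -> unrouted.add(e.key()))
                .route("orders.", e -> queueStandIn.add(e.key()));

        router.accept(new Event("orders.public.orders", "42", "{}"));
        router.accept(new Event("audit.public.log", "7", "{}"));
        System.out.println(queueStandIn + " " + unrouted); // prints "[42] [7]"
    }
}
```

Because the router is just a `Consumer`, swapping a sink (for example replacing the in-memory list with an HTTP client) does not touch the dispatch logic.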

Costs vs Kafka Connect Debezium

  • No out-of-the-box offset durability. Kafka Connect persists connector offsets in a compacted Kafka topic; Debezium Engine leaves it to the host application to choose an offset store and keep it durable (file-based or in a database).
  • Each app owns its own JVM + dependency tree. This is why the pgjdbc-upgrade dance in Zalando's 2023 post was necessary — a fleet of Engine apps each carries its own transitive pgjdbc, so rolling out a pgjdbc fix is a per-app Docker image rebuild.
  • No built-in schema registry. Host app owns serialization.
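
For the offset-durability cost, a file-backed store is one option a host application might configure. The sketch below adds the relevant settings to an existing `Properties` object. The keys and the `FileOffsetBackingStore` class name are given as understood from Debezium's engine documentation; the helper `OffsetStorageConfig`, the file path, and the flush interval are hypothetical. The file only gives durability if it lives on storage that survives restarts of the app.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.util.Properties;

/** Illustrative helper, not part of Debezium. */
class OffsetStorageConfig {

    /** Returns a copy of base with file-backed offset storage settings added. */
    static Properties withFileOffsets(Properties base, Path offsetFile, Duration flushInterval) {
        Properties props = new Properties();
        props.putAll(base);
        props.setProperty("offset.storage",
                "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", offsetFile.toString());
        props.setProperty("offset.flush.interval.ms", Long.toString(flushInterval.toMillis()));
        return props;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("name", "orders-stream"); // placeholder engine name
        Properties props = withFileOffsets(
                base, Paths.get("/var/lib/cdc/offsets.dat"), Duration.ofSeconds(10));
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```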

Seen in

  • sources/2023-11-08-zalando-patching-the-postgresql-jdbc-driver: canonical wiki introduction of Debezium Engine. Zalando's event-streaming platform runs "a micro application, powered by Debezium Engine" per declared event stream; the post is framed around the pgjdbc fix (PR #2941) that flows through Debezium Engine's transitive pgjdbc dependency to every Engine-based CDC stream. Canonical wiki framing: Debezium Engine as the embedded-mode sibling of Kafka-Connect Debezium, used where the host application wants direct control over the event-processing pipeline.