
In-source CDC checkpointing

Definition

In-source CDC checkpointing is the offset-durability class where a CDC consumer persists its progress position inside the source database itself — not in a separate consumer-managed datastore (Redis / external SQL DB) and not in a server-managed object like a Postgres replication slot.

Canonical verbatim from the Redpanda Connect oracledb_cdc launch:

"Restarts resume from a checkpointed position stored in Oracle itself, with no external cache required, no re-snapshot, and no gaps." (Source: sources/2026-04-09-redpanda-oracle-cdc-now-available-in-redpanda-connect)

Fourth canonical offset-durability class

Across the CDC engines canonicalised on the wiki, the offset-durability axis splits into four distinct shapes:

Engine                     | Offset ownership | Storage location
PostgreSQL                 | Server           | Replication slot (confirmed_flush_lsn)
MySQL                      | Consumer         | External store (Redis / SQL)
MongoDB                    | Consumer         | External store (resume token)
Cloud Spanner              | Consumer         | Transactional row in source DB
Oracle (Redpanda Connect)  | Consumer         | Checkpoint table in source DB

Spanner and Oracle both fit the in-source-resident shape, but they differ on one structural axis: atomicity with the data.

  • Spanner: progress row and data row commit in the same transaction — offset advance and data emission are atomically linked.
  • Oracle (Redpanda Connect): progress stored in a separate checkpoint table the connector maintains — the connector is responsible for advancing the checkpoint only after downstream acknowledgement. Atomicity with the emit is an explicit connector-side obligation, not a transactional property of the source DB schema.

Structural benefits

  • No external cache service to operate. Contrast the MySQL / MongoDB consumer side, where Redis or a SQL database must be deployed alongside the CDC pipeline just to hold binlog_position or MongoDB resume tokens. In-source checkpointing collapses the deployment topology to "connector + source DB".
  • Backup / restore alignment. Whoever backs up the source DB automatically backs up the CDC checkpoint. A restore to an earlier point in time automatically rewinds the CDC progress — no separate restore of a Redis instance.
  • DBA-visible progress. The checkpoint table is inspectable via standard SQL; operators can confirm CDC liveness without touching the consumer-side store.
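The DBA-visibility point can be illustrated with plain SQL against a hypothetical checkpoint table (sqlite3 again stands in for the source DB; the schema, pipeline names, and the 300-second staleness threshold are all assumptions):

```python
import sqlite3
import time

# Liveness is checkable with ordinary SQL against the checkpoint
# table -- no consumer-side store needs to be queried. Schema and
# data here are illustrative.

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE cdc_checkpoint ("
    "  pipeline TEXT PRIMARY KEY,"
    "  position INTEGER NOT NULL,"
    "  updated_at REAL NOT NULL)"
)
now = time.time()
db.execute("INSERT INTO cdc_checkpoint VALUES ('orders', 120, ?)", (now,))
db.execute("INSERT INTO cdc_checkpoint VALUES ('billing', 45, ?)", (now - 900,))

def stalled_pipelines(db, max_age_seconds=300):
    # A checkpoint row that has not advanced recently signals a
    # stalled pipeline.
    cutoff = time.time() - max_age_seconds
    return [row[0] for row in db.execute(
        "SELECT pipeline FROM cdc_checkpoint WHERE updated_at < ?",
        (cutoff,),
    )]
```

An operator can run the same SELECT from any SQL client to confirm CDC liveness, without access to the connector host.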

Structural trade-offs

  • Write amplification on the source. The source DB takes writes from its own OLTP traffic and from CDC checkpoint advancement. For a lightly-loaded checkpoint (advance-every-N-events) the amplification is negligible; for a continuously-advanced checkpoint it could be material.
  • DB permissions required. The CDC user must have INSERT/UPDATE rights on the checkpoint table. MySQL-style external-offset-store consumers don't need any write permission on the source.
  • Schema conflict risk. The checkpoint-table schema must be stable across connector versions, or operator intervention is required on upgrade. External-offset-store designs sidestep this by owning their schema entirely.
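The advance-every-N-events cadence mentioned under write amplification can be sketched as follows; the class and the interval of 3 are illustrative assumptions, not connector configuration:

```python
# Sketch of batched checkpoint advancement: only every N-th
# downstream ack produces a write to the source DB, capping
# checkpoint write amplification.

class BatchedCheckpointer:
    def __init__(self, write_offset, every_n=100):
        self.write_offset = write_offset  # persists the offset in the source DB
        self.every_n = every_n
        self.pending = 0
        self.last_acked = None

    def on_ack(self, position):
        # Remember the acked position; only every N-th ack hits the DB.
        self.last_acked = position
        self.pending += 1
        if self.pending >= self.every_n:
            self.flush()

    def flush(self):
        # Called on the batch boundary and on clean shutdown.
        if self.pending:
            self.write_offset(self.last_acked)
            self.pending = 0

writes = []
cp = BatchedCheckpointer(writes.append, every_n=3)
for pos in range(1, 8):   # seven downstream acks
    cp.on_ack(pos)
cp.flush()                # shutdown flush picks up the remainder
```

The trade-off is recovery granularity: a crash between flushes replays up to N-1 already-delivered events, so downstream consumers must tolerate at-least-once delivery.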

Relationship to Postgres replication slots

Postgres logical replication slots are a server-side offset-durability shape: the server owns the confirmed_flush_lsn and pins WAL until the subscriber advances it. In-source checkpointing is consumer-side: the consumer owns the write cadence and the table, the DB merely hosts the row. The two shapes share the "inside the source DB" location but differ on which party writes the offset:

  • Postgres slot: server writes, consumer acks.
  • Oracle checkpoint table (Redpanda Connect): consumer writes, server stores.

Composition with backup / DR

Because the checkpoint lives in the source DB, standard database backup / DR pipelines automatically include CDC progress. This is a significant operational advantage over external-offset-store designs, where CDC-progress loss is its own disaster scenario distinct from source-DB recovery.

The 2026-04-09 Redpanda post implicitly celebrates this by phrasing the benefit as "no external cache required" — eliminating a whole class of operational surface.
