CONCEPT Cited by 1 source
MongoDB change streams
Definition
MongoDB change streams are MongoDB's native CDC mechanism: a cursor over the replica set's operations log (oplog), an append-only collection of every data-mutating operation committed on the primary. Clients open a change stream on a collection, a database, or the whole cluster, and receive change events (insert / update / delete, with a configurable document payload) in commit order.
Change streams were introduced in MongoDB 3.6 at collection scope (database- and cluster-wide streams followed in 4.0) and are the official consumer-facing API on top of the oplog; consumers should not tail the oplog directly.
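A minimal sketch of the three scopes, assuming the PyMongo driver, where `MongoClient`, `Database`, and `Collection` each expose a `watch()` method. The helper names `open_change_stream` and `describe` are hypothetical, and the function takes an already-connected client so the snippet itself needs no driver import.

```python
def open_change_stream(client, db_name=None, coll_name=None, **watch_kwargs):
    """Open a change stream at cluster, database, or collection scope.

    `client` is assumed to be a connected pymongo.MongoClient (or any object
    with the same indexing and watch() surface). Hypothetical helper.
    """
    if coll_name is not None and db_name is None:
        raise ValueError("a collection scope also needs db_name")
    if db_name is None:
        return client.watch(**watch_kwargs)  # whole cluster
    if coll_name is None:
        return client[db_name].watch(**watch_kwargs)  # one database
    return client[db_name][coll_name].watch(**watch_kwargs)  # one collection


def describe(event):
    """Summarise a change event from its operationType, ns, and documentKey."""
    ns = event.get("ns", {})
    return "{op} {db}.{coll} key={key}".format(
        op=event["operationType"],
        db=ns.get("db"),
        coll=ns.get("coll"),
        key=event.get("documentKey"),
    )
```

With a real deployment, iterating the returned stream yields events as they commit, for example `for event in open_change_stream(client, "shop", "orders"): print(describe(event))` (database and collection names are placeholders). Change streams require a replica set or sharded cluster; a standalone server does not support them.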
Why it matters for CDC
Change streams are the substrate under every MongoDB CDC connector in the wiki: Debezium's MongoDB connector and Redpanda Connect's mongodb_cdc both consume this API. The 2025-03 Redpanda post canonicalises the operational shape verbatim:
"Captures updates directly from MongoDB's operations log, providing an efficient, near-real-time data stream."
The post adds two structurally distinctive properties:
- Parallelised initial snapshots. "The connector employs parallel reads during snapshots, significantly boosting performance for large-scale data migrations by splitting collections into manageable chunks." Canonicalised as parallel snapshot.
- Flexible document modes. "Customizable document handling for updates and deletes, supporting full-document lookups and pre/post image capture." MongoDB change streams can emit just the change delta, the full post-update document, or pre+post images — the consumer chooses.
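The document modes in the second bullet correspond to options on the change stream itself. The sketch below maps three hypothetical mode labels to keyword arguments for PyMongo's `watch()`; the option values are the ones MongoDB documents for change streams, while the function names and mode labels are assumptions made for illustration. It does not show how `mongodb_cdc` configures these modes.

```python
def watch_options(mode):
    """Map a document mode (hypothetical label) to watch() keyword arguments."""
    if mode == "delta":
        # Default behaviour: update events carry only an updateDescription.
        return {}
    if mode == "full_document":
        # The server looks up the current document when the event is read.
        return {"full_document": "updateLookup"}
    if mode == "pre_post_images":
        # Assumes MongoDB 6.0+ and a collection with
        # changeStreamPreAndPostImages enabled.
        return {
            "full_document": "whenAvailable",
            "full_document_before_change": "whenAvailable",
        }
    raise ValueError(f"unknown document mode: {mode!r}")


def changed_fields(event):
    """Return (updated_fields, removed_fields) from an update event's delta."""
    desc = event.get("updateDescription", {})
    return desc.get("updatedFields", {}), desc.get("removedFields", [])
```

A caller would pass the result through, as in `collection.watch(**watch_options("full_document"))`. Note that `updateLookup` returns the document as it exists at lookup time, which can already include later writes; pre/post images avoid that at the cost of extra storage on the server.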
Offset-durability shape
MongoDB change streams identify events by resume tokens (opaque values keyed to an oplog timestamp + event identity). Unlike Postgres's server-owned replication slot, MongoDB does not persist consumer progress server-side — the consumer must persist its own resume token. Redpanda Connect's MongoDB CDC connector therefore requires an external offset store: "Uses external stores for oplog positions, similar to MySQL, giving you control over your checkpointing strategy."
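The consumer-owned checkpoint described above can be sketched as a loop that loads a resume token, opens the stream with `resume_after`, and saves the token of each event after handling it. `FileTokenStore` and `consume` are hypothetical names, and a JSON file stands in for whatever external store a real connector uses. The sketch also assumes the token is a small document with a string `_data` field, as current server versions emit, so plain JSON can hold it.

```python
import json
import os


class FileTokenStore:
    """Hypothetical external offset store: one resume token in a JSON file."""

    def __init__(self, path):
        self.path = path

    def load(self):
        if not os.path.exists(self.path):
            return None  # no checkpoint yet
        with open(self.path, "r", encoding="utf-8") as fh:
            return json.load(fh)

    def save(self, token):
        # Write to a temp file, then rename, so a crash cannot leave a
        # half-written checkpoint behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(token, fh)
        os.replace(tmp, self.path)


def consume(collection, store, handle):
    """Resume from the stored token, handle events, checkpoint after each."""
    token = store.load()
    # resume_after=None opens the stream at the current point in time.
    with collection.watch(resume_after=token) as stream:
        for event in stream:
            handle(event)
            # The event's _id is its resume token. Saving it only after
            # handling gives at-least-once delivery across restarts.
            store.save(event["_id"])
```

Checkpointing after every event is the simplest strategy; a consumer that wants fewer writes can save every N events or on a timer, accepting more replay after a crash.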
Oplog retention coupling
The oplog is a capped collection — finite on disk, with a retention window driven by write rate. Any change-stream consumer whose resume token falls behind the oplog's retention horizon can no longer catch up from the stream and must restart from a fresh snapshot. This is the same failure mode as MySQL's finite binlog retention, and the structural reason snapshot-plus-catchup pipelines interleave row-copy with change-log catch-up rather than running them sequentially.
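Two small sketches of the coupling described above. The first is back-of-envelope arithmetic for how write rate sets the retention window; it ignores any minimum-retention setting on the server. The second shows the fallback a consumer needs when its token has aged out. The error code 286 (`ChangeStreamHistoryLost`) is an assumption about what the server reports in that case, and `snapshot` is a hypothetical callable that re-copies the data and returns a fresh resume token.

```python
CHANGE_STREAM_HISTORY_LOST = 286  # assumed server code for an aged-out token


def oplog_window_hours(oplog_bytes, write_bytes_per_sec):
    """Estimate how many hours of history a fixed-size oplog can hold."""
    if write_bytes_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return oplog_bytes / write_bytes_per_sec / 3600.0


def resume_or_resnapshot(collection, token, snapshot, handle):
    """Resume from `token`; fall back to `snapshot()` if history is lost.

    Returns the token to checkpoint next.
    """
    try:
        with collection.watch(resume_after=token) as stream:
            for event in stream:
                handle(event)
                token = event["_id"]
    except Exception as exc:
        # PyMongo's OperationFailure exposes the server error code as `.code`;
        # matched by attribute here so the sketch needs no driver import.
        if getattr(exc, "code", None) != CHANGE_STREAM_HISTORY_LOST:
            raise
        token = snapshot()
    return token
```

As an example of the arithmetic, a 50 GiB oplog under a sustained 1 MiB/s of oplog writes holds roughly 14 hours of history, so a consumer that is offline longer than that has to re-snapshot.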
Seen in
- sources/2025-03-18-redpanda-3-powerful-connectors-for-real-time-change-data-capture — canonical wiki introduction. MongoDB change streams named as one of four engine-native CDC substrates Redpanda Connect surfaces. Parallel snapshot of large collections + flexible document modes + external offset store canonicalised verbatim.
Related
- concepts/change-data-capture
- concepts/external-offset-store — MongoDB puts offset durability in consumer-managed stores, like MySQL and unlike Postgres.
- concepts/parallel-snapshot-cdc — MongoDB connector in Redpanda Connect ships this.
- systems/mongodb-server — the engine.
- systems/redpanda-connect — mongodb_cdc consumer.
- systems/debezium — Debezium MongoDB connector, the ecosystem alternative.
- patterns/snapshot-plus-catchup-replication — the two-phase pattern this mechanism participates in.