Kafka Schema Registry

Kafka Schema Registry is a separate service in the Kafka ecosystem that stores the schemas of records flowing through Kafka topics and enforces compatibility rules when producers register new schema versions. The canonical serialisation format in registry-integrated pipelines is Avro (though other serialisers exist); producers publish data to the topic while registering schema updates with the registry, embedding the schema ID in each serialised record, and consumers fetch schemas by ID to deserialise.
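A minimal sketch of that producer-side flow, assuming Confluent's KafkaAvroSerializer and a local broker and registry; the users topic and User schema are hypothetical placeholders, not details from the Datadog post:

```java
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        // KafkaAvroSerializer registers the record's schema under the
        // subject "users-value" (topic name + "-value" by default) and
        // embeds the returned schema ID in every message it serialises.
        props.put("value.serializer",
                  "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
          + "{\"name\":\"id\",\"type\":\"long\"},"
          + "{\"name\":\"email\",\"type\":\"string\"}]}");

        GenericRecord record = new GenericData.Record(schema);
        record.put("id", 42L);
        record.put("email", "user@example.com");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("users", "42", record));
        }
    }
}
```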

Stub page — to be expanded when future Schema-Registry-internals sources are distilled. The canonical wiki use case is Datadog's managed multi-tenant CDC replication platform, where the Schema Registry is the runtime half of the schema-evolution-safety answer (see patterns/schema-registry-backward-compat).

Compatibility modes (the surface that matters for CDC)

Schema Registry supports several compatibility modes that govern whether a proposed new schema is accepted or rejected against the stored one:

  • Backward — consumers using the new schema must still be able to read data written with the previous schema. In practice this limits changes to safe operations such as adding optional fields (with defaults) or removing existing fields. This is the mode Datadog uses; a configuration sketch follows this list.
  • Forward — data written with the new schema must still be readable by consumers using the previous schema.
  • Full — both directions must hold.
  • None — no compatibility checking.
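Modes are set per subject (or registry-wide) through the registry's REST API. A sketch using the Java 11+ HttpClient; the registry URL and the users-value subject are assumptions:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SetCompatibility {
    public static void main(String[] args) throws Exception {
        // PUT /config/{subject} sets the mode for one subject;
        // PUT /config sets the registry-wide default.
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8081/config/users-value"))
            .header("Content-Type", "application/vnd.schemaregistry.v1+json")
            .PUT(HttpRequest.BodyPublishers.ofString(
                "{\"compatibility\": \"BACKWARD\"}"))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // {"compatibility":"BACKWARD"}
    }
}
```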

Datadog's choice, in their own words:

"We've configured it for backward compatibility, which means new schemas must still allow older consumers to read data without errors. In practice, this limits schema changes to safe operations — like adding optional fields or removing existing ones." (Source: sources/2025-11-04-datadog-replication-redefined-multi-tenant-cdc-platform)

Role in the Datadog CDC platform

When a schema migration happens on a source Postgres:

  1. Debezium captures the updated schema as part of the CDC stream.
  2. Debezium serialises data into Avro format.
  3. Debezium pushes both the data and the schema update to the relevant Kafka topic + Schema Registry.
  4. The registry compares the new schema against the stored schema under the configured compatibility mode (backward at Datadog).
  5. Accept or reject.
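In Datadog's pipeline the registration in steps 3-5 happens implicitly inside Debezium's Avro converter, but the accept/reject decision can be sketched directly against Confluent's registry client API; the subject and schema below are hypothetical:

```java
import io.confluent.kafka.schemaregistry.avro.AvroSchema;
import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException;

public class RegisterSketch {
    public static void main(String[] args) throws Exception {
        SchemaRegistryClient client =
            new CachedSchemaRegistryClient("http://localhost:8081", 100);

        AvroSchema proposed = new AvroSchema(
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
          + "{\"name\":\"id\",\"type\":\"long\"},"
          + "{\"name\":\"email\",\"type\":[\"null\",\"string\"],\"default\":null}]}");

        try {
            // The registry checks the proposed schema against the stored one
            // under the subject's compatibility mode, then either assigns it
            // a new version and ID or rejects the registration.
            int id = client.register("users-value", proposed);
            System.out.println("accepted, schema id = " + id);
        } catch (RestClientException e) {
            // An incompatible schema comes back as HTTP 409 Conflict.
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```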

The design intent is explicitly that the registry protects both in-platform consumers and external custom consumers:

"Since users can also build custom Kafka consumers to directly read the topics, maintaining schema compatibility is especially important — we want to ensure that all consumers, whether internal or external, continue to work without disruption." (Source: sources/2025-11-04-datadog-replication-redefined-multi-tenant-cdc-platform)

Multi-tenancy

Datadog's deployment is explicitly a multi-tenant Kafka Schema Registry, integrated with both source and sink connectors across teams' pipelines. The post doesn't break down the multi-tenancy mechanism (per-namespace schema groups, ACLs, …), so this is noted but not detailed here.

Composition with pre-deploy validation

Schema Registry is one of two layers in Datadog's schema-evolution answer:

  • Pre-deploy (offline) — mechanism: automated schema-management validation analysing migration SQL. Catches: structural breaking changes like SET NOT NULL on a column that in-flight messages might not populate.
  • Runtime (online) — mechanism: Kafka Schema Registry in backward-compat mode; Avro-serialised data and schema update pushed together. Catches: schema-incompatible updates, rejected at the registry.

Defence in depth: offline analysis catches breaking changes before they hit production; the runtime Schema Registry catches the residual class that slips through.
