CONCEPT Cited by 4 sources
Schema registry¶
A schema registry is a centralized, versioned store of data contracts — typically event / message / record schemas — used as the single source of truth for the shape, type, and semantics of data flowing across service boundaries. It turns implicit, per-team contracts into explicit, discoverable, and auditable artifacts.
Core responsibilities¶
- Single source of truth for event/message definitions across teams.
- Versioning — schemas evolve; the registry retains previous versions; publishers and subscribers negotiate which version they produce/consume.
- Validation — publishers can validate outbound events; subscribers can validate inbound events.
- Compatibility enforcement — backward / forward / full rules that gate schema changes at PR / build time (prevent silent contract breaks).
- Deprecation paths — structured removal of fields / events, with lead times communicated to consumers.
- Discovery — a browsable catalog of every event type + its publishers + its subscribers (the "who produces what, who consumes what" map that ad-hoc pub/sub systems famously lack).
- Audit trails — every schema change attributed, reviewed, and reversible.
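The versioning and compatibility-enforcement responsibilities above can be sketched as a small in-memory registry. This is an illustrative sketch, not any real registry's API: `InMemoryRegistry`, `is_backward_compatible`, and the `door.unlocked` subject are hypothetical names, and the compatibility rule is deliberately simplified to two checks on JSON-Schema-like dicts (no newly required fields, no changed field types).

```python
from dataclasses import dataclass, field


class IncompatibleSchemaError(ValueError):
    """Raised when a new version would break readers of older data."""


def is_backward_compatible(old: dict, new: dict) -> bool:
    """Simplified rule: data written under `old` must still satisfy `new`.

    Assumption: schemas are JSON-Schema-like dicts using only
    `properties` and `required`.
    """
    # A field required by `new` must already have been required by `old`,
    # otherwise old events may lack it.
    if not set(new.get("required", [])) <= set(old.get("required", [])):
        return False
    # A field present in both versions must keep its declared type.
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for name in old_props.keys() & new_props.keys():
        if old_props[name].get("type") != new_props[name].get("type"):
            return False
    return True


@dataclass
class InMemoryRegistry:
    # subject name -> ordered list of schema versions (index 0 is version 1)
    _subjects: dict = field(default_factory=dict)

    def register(self, subject: str, schema: dict) -> int:
        """Store a new version and return its 1-based version number."""
        versions = self._subjects.setdefault(subject, [])
        if versions and not is_backward_compatible(versions[-1], schema):
            raise IncompatibleSchemaError(f"{subject}: change is not backward compatible")
        versions.append(schema)
        return len(versions)

    def get(self, subject: str, version=None) -> dict:
        """Return the requested version, or the latest when `version` is None."""
        versions = self._subjects[subject]
        return versions[-1] if version is None else versions[version - 1]
```

In a CI hook, the same `register` call (or just `is_backward_compatible`) would run against the proposed schema, so an incompatible change fails the build instead of reaching consumers.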
Registry vs validation — a load-bearing distinction¶
"EventBridge provides developers with tools to implement validation using external solutions or custom application code, it currently does not include native schema validation capabilities." — Amazon Key team, 2026-02-04 (sources/2026-02-04-aws-amazon-key-eventbridge-event-driven-architecture)
A schema registry that stores schemas and a schema validation layer that enforces them are separable capabilities. Amazon EventBridge provides the former (plus schema discovery from live traffic) but not the latter; teams with strict validation requirements must therefore build on top, choosing between a centralized validation service and patterns/client-side-schema-validation. The Amazon Key team explicitly chose client-side validation after evaluating both: the centralized option would have added a network hop and a scaling problem of its own.
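A minimal sketch of the client-side option, assuming the schema has already been fetched from the registry as a dict. It is not Amazon Key's client library: `validation_errors`, `publish`, and the `send` callback are hypothetical, and the checker covers only `required` and primitive `type` keywords, where a real deployment would use a full JSON Schema validator.

```python
from typing import Any, Callable

# Mapping from JSON Schema primitive type names to Python types.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


class EventValidationError(ValueError):
    """Raised before publishing when an event does not match its schema."""


def validation_errors(schema: dict, event: dict) -> list:
    """Return human-readable problems; an empty list means the event is valid."""
    errors = []
    for name in schema.get("required", []):
        if name not in event:
            errors.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in event:
            continue
        type_name = spec.get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue  # unknown or absent type keyword: not checked in this sketch
        value = event[name]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if is_bool_as_number or not isinstance(value, expected):
            errors.append(f"field {name}: expected {type_name}")
    return errors


def publish(schema: dict, event: dict, send: Callable[[dict], Any]) -> None:
    """Validate in-process, then hand the event to the transport callback."""
    errors = validation_errors(schema, event)
    if errors:
        raise EventValidationError("; ".join(errors))
    send(event)
```

Because validation runs inside the publisher's process, an invalid event is rejected before any network call is made, which is the property that avoids the extra hop of a centralized validation service.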
Design axes¶
- Format: JSON Schema (Draft-04 used by Amazon Key's registry), Avro (the canonical Confluent Schema Registry format), Protobuf, OpenAPI.
- Storage: dedicated microservice vs built into the event-bus control plane.
- Code generation: runtime lookups vs build-time code bindings. Build-time bindings give developers type-safe event constructors and publish/subscribe interfaces.
- Governance model: self-service schema PRs vs gatekept by a central team.
- Integration surface: IDE plugin, CLI, CI hook, runtime library.
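To make the build-time code bindings axis concrete, here is a hedged sketch of a generator that renders a Python dataclass from a schema. `render_binding` and the `DoorUnlocked` example are hypothetical; the sketch handles only flat schemas with primitive types, and real generators cover far more of the schema language.

```python
# Mapping from JSON Schema primitive type names to Python annotation text.
_PY_TYPES = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}


def render_binding(class_name: str, schema: dict) -> str:
    """Render Python source for a frozen dataclass mirroring a flat schema."""
    required = set(schema.get("required", []))
    props = schema.get("properties", {})
    # Dataclass fields without defaults must come first, so emit required
    # fields before optional ones (sorted() is stable within each group).
    ordered = sorted(props, key=lambda name: name not in required)
    lines = [
        "from dataclasses import dataclass",
        "from typing import Optional",
        "",
        "",
        "@dataclass(frozen=True)",
        f"class {class_name}:",
    ]
    for name in ordered:
        py_type = _PY_TYPES.get(props[name].get("type"), "object")
        if name in required:
            lines.append(f"    {name}: {py_type}")
        else:
            lines.append(f"    {name}: Optional[{py_type}] = None")
    if not ordered:
        lines.append("    pass")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example_schema = {
        "type": "object",
        "properties": {
            "battery_pct": {"type": "integer"},
            "lock_id": {"type": "string"},
        },
        "required": ["lock_id"],
    }
    # A build step would write this text to a generated module.
    print(render_binding("DoorUnlocked", example_schema))
```

The trade-off against runtime lookups is that a missing required field becomes a constructor error, which type checkers and IDEs can flag, rather than a validation failure discovered at publish time.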
Why "loose schemas" is an organisational cost¶
Without a schema registry, event contracts exist only in the consumer code that parses them. Consequences:
- Breaking changes are "almost impossible to implement" safely, because publishers can't know whether a consumer relies on a field.
- No collaboration surface for schema modifications across teams.
- No place for publishers to discover whether an event is valid before it hits the bus.
- Semantic context (inheritance, composition, required-vs-optional) is lost; every consumer re-infers it.
This is exactly the gap Amazon Key's custom repository was built to close.
Seen in¶
- sources/2026-02-04-aws-amazon-key-eventbridge-event-driven-architecture — Amazon Key built a custom schema repository (JSON Schema Draft-04) alongside EventBridge because native validation is absent. Code bindings generated at build time; client library consumes schemas for pre-publish validation + serde. New-event onboarding time dropped 48h → 4h.
- sources/2025-06-14-netflix-model-once-represent-everywhere-uda — Netflix UDA unifies schema registry and data catalog into a single knowledge-graph substrate. "We needed a data catalog unified with a schema registry, but with a hard requirement for semantic integration." Schemas (GraphQL / Avro / SQL / RDF / Java) are transpiled from upstream domain models (patterns/schema-transpilation-from-domain-model), not hand-authored per surface. Because the upstream domain model is the single source of truth (patterns/model-once-represent-everywhere), the registry gains semantic integration as a property, not just schema-shape agreement.
- sources/2025-06-24-redpanda-why-streaming-is-the-backbone-for-ai-native-data-platforms — Schema registry as CI/CD artefact, not runtime afterthought. Redpanda's backbone essay positions the streaming-context registry as the API contract between teams, equivalent to the HTTP API contract for synchronous services. Verbatim: "Hooking up schema changes and publications as part of your CI/CD pipelines and infrastructure-as-code (IaC) can also help catch issues in your engineering teams earlier during development, rather than in staging or production environments." The implication is that schema evolution becomes a PR-reviewable, IaC-owned artefact rather than an ops-coordinated migration. Complements the existing registry-vs-validation framing on this page by adding a deploy-time layer on top of the registry's storage + the build-time validation hook.
- sources/2026-03-31-redpanda-261-delivers-the-industrys-first-adaptable-streaming-engine — Redpanda 26.1 launch post. Introduces two schema-registry extensions that turn a version-tracking registry into a governance substrate: (1) Schema Registry contexts (concepts/schema-registry-context) — "Contexts allow you to namespace your schemas, making it easy to isolate environments, perform complex migrations, and manage multi-team registries." Three canonical use cases: environment isolation (dev/staging/prod), complex migrations (old-context + new-context co-existing), multi-team registries (one physical registry, many logical namespaces). (2) Custom schema metadata — "You can now attach arbitrary metadata properties to your schemas, turning Redpanda Schema Registry into a first-class citizen in your data governance and observability stack." Annotation axis for owner, SLA, sensitivity classification, lineage tags. Together they extend the registry from pure schema-version-tracking into a queryable data-catalog-adjacent substrate.
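The two ideas in the last entry, namespacing by context and attaching metadata to schemas, can be modelled in a few lines. This is a conceptual sketch only and does not reflect Redpanda's Schema Registry API: `ContextualRegistry`, its methods, and the metadata keys (`owner`, `sensitivity`) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaEntry:
    schema: dict
    # Free-form annotations such as owner, SLA, or sensitivity class.
    metadata: dict = field(default_factory=dict)


class ContextualRegistry:
    """One physical store, many logical namespaces (contexts)."""

    def __init__(self) -> None:
        # Keyed by (context, subject), so the same subject name can hold
        # different schemas in, say, "staging" and "prod".
        self._entries = {}

    def register(self, context: str, subject: str, schema: dict, **metadata) -> None:
        self._entries[(context, subject)] = SchemaEntry(schema, dict(metadata))

    def get(self, context: str, subject: str) -> SchemaEntry:
        return self._entries[(context, subject)]

    def find(self, **wanted) -> list:
        """Return sorted (context, subject) keys whose metadata matches every filter."""
        return sorted(
            key
            for key, entry in self._entries.items()
            if all(entry.metadata.get(k) == v for k, v in wanted.items())
        )
```

Keying on the (context, subject) pair is what lets an old and a new version of the same subject coexist during a migration, and the `find` query is the kind of lookup that moves a registry toward catalog-style use.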
Related¶
- concepts/event-driven-architecture — the architectural style that makes schema registries load-bearing (implicit contracts scale badly across many publishers + subscribers).
- systems/amazon-eventbridge — has a schema registry but no native validation.
- patterns/client-side-schema-validation — the pattern that closes the validation gap.
- systems/netflix-uda · systems/netflix-upper — the knowledge-graph-as-registry-plus-catalog wiki instance.
- concepts/knowledge-graph · concepts/domain-model · concepts/semantic-interoperability — the axes along which UDA's schema-registry framing goes beyond classical schema-shape-only registries.
- patterns/model-once-represent-everywhere · patterns/schema-transpilation-from-domain-model — patterns that extend "schema registry" with upstream authoring + downstream projection.