CONCEPT Cited by 1 source

Schema divergence window

Definition

A schema divergence window is the period during which different components of a distributed system operate under different schema versions at the same time. During this window, components that have not yet been updated may write null values for new columns, or may omit fields that the storage layer now expects.

The window is bounded when a system uses multi-stage convergence: each phase transition brings more components onto the new schema, narrowing the divergence until full consistency is restored.
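
As a rough illustration of how multi-stage convergence bounds the window, the sketch below tracks which components still lag behind the newest schema version after each phase. The component names, version numbers, and phase ordering are hypothetical and chosen only for the example; they are not taken from any specific system.

```python
def lagging_components(versions: dict[str, int]) -> list[str]:
    """Return the components still running an older schema than the newest one in use."""
    newest = max(versions.values())
    return sorted(name for name, version in versions.items() if version < newest)


# Hypothetical starting state: every component agrees on schema version 1.
versions = {"storage": 1, "stream_job": 1, "batch_job": 1}

# Hypothetical phases: each one moves a single component to version 2.
phases = [("storage", 2), ("stream_job", 2), ("batch_job", 2)]

lagging_after_each_phase = []
for component, new_version in phases:
    versions[component] = new_version
    lagging_after_each_phase.append(lagging_components(versions))

# The window opens after the first phase and closes once no component lags.
window_closed = not lagging_components(versions)
```

In this model the window is open whenever the list of lagging components is non-empty, and every phase can only shrink that list, which is what makes the window bounded.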

Pinterest's instance

In Pinterest's CDC pipeline, after the Iceberg schema is updated (Phase 1), existing Flink and Spark jobs continue running because:

  • Generated code selects columns by name, not position
  • Iceberg treats newly added columns as nullable with a default of null

This produces a divergence window in which new columns exist in storage but are written as null, lasting until Phases 2 and 3 deploy updated code and backfill historical data. (Source: sources/2026-06-24-pinterest-automated-schema-evolution-in-pinterests-next-generation-db)
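
The two conditions above can be modeled in a few lines of plain Python. This is a minimal sketch, not Pinterest's generated code and not the Iceberg, Flink, or Spark API: the function names, the column names, and the `email` column are all assumptions made for illustration.

```python
from typing import Any


def project_by_name(record: dict[str, Any], table_columns: list[str]) -> dict[str, Any]:
    """Map a record onto the table's columns by name; columns the writer
    does not know about are filled with None (modeling a nullable column)."""
    return {column: record.get(column) for column in table_columns}


def project_by_position(values: list[Any], table_columns: list[str]) -> dict[str, Any]:
    """Positional mapping, shown only for contrast: values are paired with
    columns in order, so an inserted column shifts everything after it."""
    return dict(zip(table_columns, values))


# Hypothetical table schema after the storage-side update: "email" is new.
new_columns = ["id", "email", "name"]

# A writer that has not been updated yet still produces only the old fields.
old_writer_record = {"id": 1, "name": "a"}

row_by_name = project_by_name(old_writer_record, new_columns)
row_by_position = project_by_position([1, "a"], new_columns)
```

Here `row_by_name` keeps `id` and `name` intact and leaves `email` as `None`, which is the divergence-window state described above. `row_by_position` places the name under `email` and drops the `name` column entirely, which is the failure that name-based selection avoids.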

Seen in