CONCEPT Cited by 1 source

Multi-stage convergence

Definition

Multi-stage convergence is a design discipline in which a change to a distributed system (schema update, configuration propagation, data migration) is decomposed into a sequence of bounded phases, each advancing one layer toward the target state, rather than requiring an atomic cross-system transition. Between phases the layers may temporarily disagree; the system tolerates this divergence as long as it is resolved within a defined SLA window.

The approach is motivated by the observation that in distributed pipelines, atomic cross-layer changes are either impossible (independent deployment cadences), unaffordable (global stop-the-world), or fragile (two-phase commit across heterogeneous systems). Multi-stage convergence trades atomicity for availability and operational safety.

Pinterest's canonical instance

Pinterest's CDC ingestion platform treats schema evolution as a three-phase convergence process across Kafka, Flink, Spark, and Iceberg:

  • Phase 1 (Schema Divergence): Iceberg table schemas updated; existing jobs write null for new columns
  • Phase 2 (Code Convergence): Spark and Flink code regenerated and deployed
  • Phase 3 (Data Convergence): Historical data backfilled; base table reaches full consistency
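
The three phases above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not Pinterest's implementation: the `Table` class, the `country` column, the `upstream` lookup, and `run_demo` are all hypothetical names invented for illustration, and plain Python dicts stand in for Iceberg tables and Spark/Flink jobs.

```python
class Table:
    """Hypothetical stand-in for a table whose schema evolves independently of its writers."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = []

    def add_column(self, name):
        # Phase 1: the schema changes; rows already stored are left untouched.
        self.columns.append(name)

    def write(self, record, writer_columns):
        # A writer only populates the columns its (possibly stale) code knows about;
        # every other column in the current schema is stored as None (null).
        self.rows.append(
            {c: (record.get(c) if c in writer_columns else None) for c in self.columns}
        )

    def read(self):
        # Rows written before a column existed read back as None for that column.
        return [{c: row.get(c) for c in self.columns} for row in self.rows]

    def backfill(self, column, lookup):
        # Phase 3: fill historical nulls from an upstream source of truth.
        for row in self.rows:
            if row.get(column) is None:
                row[column] = lookup[row["id"]]

    def converged(self, column):
        return all(row.get(column) is not None for row in self.rows)


def run_demo():
    upstream = {1: "US", 2: "DE", 3: "BR"}  # hypothetical source-of-truth values
    table = Table(["id", "name"])
    old_writer = ["id", "name"]
    table.write({"id": 1, "name": "a", "country": "US"}, old_writer)

    # Phase 1 (Schema Divergence): schema updated, old code keeps writing nulls.
    table.add_column("country")
    table.write({"id": 2, "name": "b", "country": "DE"}, old_writer)
    after_phase1 = table.converged("country")

    # Phase 2 (Code Convergence): redeployed writer populates the new column.
    new_writer = ["id", "name", "country"]
    table.write({"id": 3, "name": "c", "country": "BR"}, new_writer)
    after_phase2 = table.converged("country")

    # Phase 3 (Data Convergence): historical rows are backfilled.
    table.backfill("country", upstream)
    after_phase3 = table.converged("country")

    return after_phase1, after_phase2, after_phase3, table.read()
```

In this sketch the table stays writable and readable throughout; only the final phase makes `converged("country")` true, which mirrors the idea that availability is preserved while correctness is restored gradually.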

"Schema evolution is not treated as an atomic operation. Instead, we treat it as a multi-stage convergence process across the control plane and data plane, which lets us preserve pipeline availability while gradually restoring schema and data correctness." (Source: sources/2026-06-24-pinterest-automated-schema-evolution-in-pinterests-next-generation-db)

Relationship to eventual consistency

Multi-stage convergence is a specialisation of eventual consistency (concepts/eventual-consistency): it guarantees convergence to the correct state within a bounded window, but unlike generic eventual consistency it also prescribes the ordering and mechanism of each phase transition. It is closer to a deploy-sequencing discipline than a replication protocol.
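
The two properties that distinguish this from generic eventual consistency, a prescribed phase order and a bounded window, can be sketched as a small tracker. Everything here is an assumption made for illustration: the `Phase` names, the `ConvergenceTracker` class, and the status strings are hypothetical and do not come from the source.

```python
from enum import IntEnum


class Phase(IntEnum):
    # Hypothetical phase names; the integer order encodes the prescribed sequence.
    TARGET_DECLARED = 0
    SCHEMA_UPDATED = 1
    CODE_DEPLOYED = 2
    DATA_BACKFILLED = 3


class ConvergenceTracker:
    """Tracks one change through its phases and checks it against a time window."""

    def __init__(self, started_at: float, window_seconds: float):
        self.phase = Phase.TARGET_DECLARED
        self.deadline = started_at + window_seconds

    def advance(self, to: Phase) -> None:
        # Phases may not be skipped or reordered: only the next one is allowed.
        if to != self.phase + 1:
            raise ValueError(f"cannot move from {self.phase.name} to {to.name}")
        self.phase = to

    def status(self, now: float) -> str:
        if self.phase == Phase.DATA_BACKFILLED:
            return "converged"
        # Divergence is acceptable only while the window is still open.
        return "diverged-within-window" if now <= self.deadline else "window-breached"
```

A caller would create one tracker per change, call `advance` as each phase completes, and alert when `status` reports `"window-breached"`. Timestamps are passed in explicitly so the logic stays deterministic and easy to test.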

Seen in
