Skip to content

CONCEPT Cited by 1 source

Data debt migration

Definition

Data debt is the accumulation of legacy tables, dashboards, and pipelines that no longer meet current modeling standards but remain in use because of downstream dependencies. Introducing new standards highlights this debt by making it visible against the new baseline.

Migrating and deprecating these assets requires extreme care: tables often have hundreds of downstream consumers, so the process involves extensive communication, dual pipeline runs for validation, and a slow deprecation cycle to avoid breaking dependent business processes (Source: sources/2026-06-09-airbnb-scaling-beyond-one-data-architecture).

Key characteristics

  • Scale of impact — a single legacy table can have hundreds of downstream consumers
  • Requires dual runs — old and new pipelines run simultaneously to validate correctness
  • Slow deprecation — painstakingly careful decommissioning to avoid breaking business
  • Ongoing effort — Airbnb reports continued refinement months after initial launch
  • Not optional — leaving debt in place risks data silos, inconsistent analytics, and future innovation slowdown

Seen in

Last updated · 542 distilled / 1,571 read