Airbnb

Airbnb Engineering blog. Tier-2 source on the sysdesign-wiki. Historically strong on marketplace infra, Kubernetes tooling, data platform, and service mesh; recent posts cover dynamic configuration, developer platform, incident tooling, and (2026-05) the OSS 1.0 release of the Viaduct GraphQL multi-tenant runtime.

Key systems

systems/airbnb-knowledge-graph-infrastructure — internally managed, multi-tenant knowledge graph platform built on JanusGraph + DynamoDB + OpenSearch; namespace-isolated tenants include identity graph (7B nodes, 11B edges), inventory knowledge graph, fraud detection, and data lineage; management service handles schema enforcement + index lifecycle + Thrift API generation; internal JanusGraph fork with custom transaction strategy (DynamoDB conditional writes), parallel multi-slice fetching, and distributed tracing
systems/viaduct — Airbnb's GraphQL-based "data-oriented service mesh": a multi-tenant runtime that hosts independently developed and tested tenant modules, each owning a portion of the schema. Used internally for years; OSS 1.0 released 2026-05-13 to Maven Central with full @StableApi / @ExperimentalApi / @InternalApi discipline + Kotlin binary compatibility validator in CI + Dokka docs + open RFC community process. The third topology on the wiki for decentralized development of a central GraphQL schema, positioned as complementary to Apollo Federation (Viaduct instances can participate as Federation subgraphs, collapsing per-team server cost while preserving cross-org composition). Load-bearing framing: "Federation distributes development by distributing servers. Viaduct distributes development by distributing modules."
systems/airbnb-skipper — embedded Java/Kotlin workflow engine providing durable execution as a library dependency rather than an external orchestration cluster; uses the host service's existing database (MySQL / UDS / DynamoDB) for workflow state, replays the workflow method on crash while short-circuiting previously completed actions via stored results; 5-annotation programming model (@WorkflowMethod, @StateField, @SignalMethod, @Execute(checkpoint = true), @Compensate); in production >1 year across 15+ teams (insurance, payments, media, infrastructure, incentives, wallet) with peak 10 000 workflows / second on DynamoDB
systems/airbnb-uds — Airbnb's internal Unified Data Store (stub); named as one of Skipper's pluggable persistence backends alongside MySQL
systems/sitar — internal dynamic configuration platform (control plane + data plane + sidecar agent + GitHub-based config workflow)
systems/airbnb-observability-platform — in-house Prometheus/PromQL metrics platform (1,000 services, 300M timeseries, 3,100 dashboards, 300K+ alerts) replacing a vendor stack after a ~5-year migration; OTLP collection + vmagent streaming aggregation at 100M+ samples/sec; reliability plane (2026-05-05): dedicated-but-managed K8s clusters + custom Envoy L7 ingress tier (independent of Istio) + meta-monitoring HA Prometheus–Alertmanager pairs terminated by a dead-man's switch on AWS SNS + CloudWatch
systems/airbnb-metrics-storage — the storage plane under the observability platform: multi-cluster, multi-tenant time-series storage fleet at 50M samples/sec / 1.3B active series / 2.5 PB logical data; tenant-per-application with shuffle sharding on read + write paths; three-zone stateful deploys; multi-cluster federation via custom Promxy with native-histogram support + query-fanout optimization; progressive cluster rollout (test → internal → app → infra) for >99.9% availability
systems/vmagent — VictoriaMetrics agent used as Airbnb's sharded two-tier (router + aggregator) streaming-aggregation tier
systems/himeji — centralized authorization system enforcing access at the data layer; write-time relation denormalization for fast read-time permission checks
systems/airbnb-destination-recommendation — transformer-based sequence model predicting user travel destinations; user actions as tokens (summed city + region + days-to-today embeddings); multi-task region + city heads; serves autosuggest + abandoned-search email notifications
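
A minimal Python sketch of the replay idea behind systems/airbnb-skipper as summarised above: the workflow method is re-run after a crash, and actions that already completed return their stored result instead of executing again. Skipper itself is a Java/Kotlin library; `ReplayContext`, `booking_workflow`, and the in-memory dict standing in for the host service's database are hypothetical names for illustration, not Skipper's API.

```python
from typing import Any, Callable, Dict, List, Tuple


class ReplayContext:
    """Hypothetical engine stand-in: stores action results keyed by call order."""

    def __init__(self, store: Dict[int, Any]) -> None:
        self.store = store              # a durable table in a real engine; a dict here
        self.cursor = 0                 # position of the next action in this run
        self.executed: List[str] = []   # actions that really ran in this attempt

    def execute(self, name: str, action: Callable[[], Any]) -> Any:
        idx = self.cursor
        self.cursor += 1
        if idx in self.store:
            # Completed in an earlier attempt: short-circuit with the stored result.
            return self.store[idx]
        result = action()
        self.store[idx] = result        # checkpoint before moving on
        self.executed.append(name)
        return result


def booking_workflow(ctx: ReplayContext, crash_after_reserve: bool = False) -> str:
    # The workflow method must be deterministic: side effects live in actions only,
    # otherwise positional keys would not line up between attempts.
    reservation = ctx.execute("reserve", lambda: "res-1")
    if crash_after_reserve:
        raise RuntimeError("simulated pod crash")
    return ctx.execute("charge", lambda: f"paid:{reservation}")


def run_with_replay() -> Tuple[List[str], List[str], str]:
    store: Dict[int, Any] = {}
    first = ReplayContext(store)
    try:
        booking_workflow(first, crash_after_reserve=True)
    except RuntimeError:
        pass                            # the attempt died after the first action
    second = ReplayContext(store)       # replay: same method, same store
    result = booking_workflow(second)
    return first.executed, second.executed, result
```

In this sketch the second attempt skips `reserve` and only runs `charge`, which is the short-circuit behaviour the entry describes. Keying results by call order is an assumption of the sketch; the source does not say how Skipper matches stored results to actions.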

Key patterns / concepts

concepts/separate-vs-monolithic-data-models — the core trade-off for multi-product data warehouses: per-product tables vs. unified tables; Airbnb's framework empowers each domain team to choose based on attribute commonality, enforced by three foundational principles (no hybrid models, consistent identifier naming, namespace organization)
concepts/offline-data-warehouse-as-translation-layer — the warehouse transforms raw OLTP data into a standardized analytical source of truth
patterns/foundational-principles-with-decentralized-guidelines — central guardrails + domain-team autonomy as an organizational scaling pattern
concepts/embedded-workflow-engine — Skipper canonicalises the library-in-service shape of durable execution, explicitly rejecting external orchestration clusters (Temporal, Cadence, Step Functions) for Tier 0 services to avoid adding a new critical dependency
concepts/workflow-replay-from-checkpointed-actions — Skipper's durability mechanism; state fields persisted directly, previously completed actions short-circuit on replay via stored results; no event log (trades auditability for leaner execution)
concepts/workflow-determinism-requirement — the correctness invariant Skipper's replay model requires (side effects must live in actions, never in the workflow method)
concepts/workflow-compensation-action — Skipper's @Compensate annotation elevates saga-style compensating actions to a first-class programming primitive with reverse-order orchestration
concepts/workflow-signal — Skipper's @SignalMethod for externally-mutating workflow state while the workflow hibernates on waitUntil { cond }
patterns/workflow-primitives-as-annotated-classes — Skipper's 5-annotation contract (@WorkflowMethod / @StateField / @SignalMethod / @Execute(checkpoint = true) / @Compensate) as a minimal, codegen-free, DSL-free workflow definition surface
patterns/delayed-timeout-task-as-crash-safety-net — Skipper's happy-path-near-zero-overhead mechanism: 2 DB writes at start, batched checkpoints, scheduled timeout task fires harmlessly on completion or triggers replay on crash
patterns/saga-over-long-transaction — the distributed-systems pattern Skipper's @Compensate annotation operationalises
patterns/staged-rollout — first-class platform feature in Sitar (env / zone / pod-%)
patterns/sidecar-agent — per-pod config fetcher with local cache
patterns/git-based-config-workflow — PRs as the default config change path; emergency portal as override
concepts/control-plane-data-plane-separation — explicit "decide" vs "deliver" split in Sitar
concepts/observability — own-the-interaction-layer thesis; vendor pricing / feedback-loop motivations for going in-house
concepts/metric-type-metadata — _otel_metric_type_-driven engine that replaces Prometheus naming-based type inference
patterns/intent-preserving-query-translation — map query intent (e.g., canonical histogram for any p95), not literal queries
patterns/alerts-as-code — Reliability XP alert framework with autocomplete, backtesting, and diffing
patterns/alert-backtesting — replay proposed alerts against historical metric data at PR-diff granularity, with "noisiness" scoring + per-alert inspection; hooks into Prometheus's rule manager
patterns/achievable-target-first-migration — start migration with a tractable, well-aligned service, not the hardest one
concepts/identity-decoupling — User ID vs. per-context Profile IDs as a privacy primitive; different types, not just different values
concepts/least-privileged-access — enforced at the data layer via Himeji, not bolted on per endpoint
patterns/audit-then-refactor-migration — audit scripts → team ownership map → manual review → AI-assisted refactor → type safety, used for the User/Profile ID migration
patterns/dual-write-migration — shared metrics library dual-emits StatsD + OTLP to migrate ~40% of services with one config change
patterns/zero-injection-counter — vmagent tweak that fixes Prometheus rate() undercounting of sparse counters
concepts/streaming-aggregation — in-transit metric aggregation (vmagent routers + aggregators) to collapse per-instance cardinality before storage
concepts/metric-temporality — delta vs. cumulative; Airbnb moved top-cardinality emitters to delta to bound SDK memory
concepts/user-action-as-token — language-modeling framing for recommendation: chronological user actions as transformer tokens; per-action embedding = sum of attribute embeddings (city / region / days-to-today)
patterns/active-dormant-user-training-split — generate N+M training examples per positive outcome — N recent with full history, M dormant with long-term history only — to keep a single model accurate for both recently-active and long-dormant users (Airbnb: 14 examples per booking = 7 active + 7 dormant)
patterns/hierarchical-multitask-geo-prediction — attach multiple prediction heads at different geographic-hierarchy levels (region + city) and train jointly so the encoder learns the taxonomy via auxiliary-task regularization
concepts/shuffle-sharding — randomly-chosen subset of backend nodes per tenant so a bad tenant's blast radius is their K-node shuffle set, not the whole fleet; used on both write and read paths of the metrics storage system
concepts/active-multi-cluster-blast-radius — Airbnb runs dedicated + application clusters in parallel so cluster-scoped failures affect at most 1/N of tenants
concepts/cross-cluster-federated-query-cost — Airbnb measured cross-cluster queries at 5–10× the cost of single-cluster ones; forced adjustments in tenant-consolidation strategy around hot read patterns
patterns/tenant-per-application — reject tenant-per-team (ownership changes frequently) for tenant-per-service (stable, attributable, ready for chargeback); ~1,000 services = ~1,000 tenants
patterns/progressive-cluster-rollout — rollout sequence by criticality (test → internal → application → infrastructure), with infrastructure last so observability remains intact through upstream regressions
patterns/multi-tenant-graphql-runtime — Viaduct's contribution to the wiki: one shared GraphQL runtime hosts many independently developed and tested tenant modules; the third topology for decentralized GraphQL schema development alongside UBFF (one service, one module) and Federation (many services, one module each). Complementary to Federation: Viaduct instances can participate as Federation subgraphs.
patterns/module-based-graphql-decentralization — Viaduct's contribution model: distribute schema development through modules-in-runtime rather than subgraphs-in-services. A module = directory + SDL + resolvers; no per-team server.
patterns/api-stability-annotations — Viaduct 1.0's OSS-readiness discipline: @StableApi / @ExperimentalApi / @InternalApi annotations across all public surfaces + Kotlin's binary compatibility validator in CI.
concepts/data-oriented-service-mesh — Viaduct's self-description; the schema graph (not the network proxy graph) is the mesh substrate. Disambiguates from Envoy/Istio's RPC-oriented service mesh.
concepts/decentralized-development-of-central-schema — the problem space all three GraphQL topologies attack; central schema for discoverability + governance, decentralized development so domain experts can ship.
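
A minimal Python sketch of concepts/shuffle-sharding as listed above: each tenant is mapped to a stable, pseudo-random K-node subset of the fleet, so a misbehaving tenant can only affect its own subset. The source does not describe how Airbnb selects the subset; the hash-seeded sampling, the function name `shuffle_shard`, and the node and tenant names are assumptions made for illustration.

```python
import hashlib
import random
from typing import List


def shuffle_shard(tenant: str, nodes: List[str], k: int) -> List[str]:
    """Return a deterministic K-node subset of `nodes` for `tenant`."""
    if k > len(nodes):
        raise ValueError("shard size cannot exceed fleet size")
    # Seed a private RNG from the tenant name so every caller derives
    # the same subset without coordination.
    digest = hashlib.sha256(tenant.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    # Sort first so the result does not depend on the caller's node ordering.
    return sorted(rng.sample(sorted(nodes), k))


def blast_radius(bad_tenant: str, nodes: List[str], k: int) -> float:
    """Fraction of the fleet that one bad tenant can touch."""
    return len(shuffle_shard(bad_tenant, nodes, k)) / len(nodes)
```

With a 12-node fleet and K = 4, one bad tenant touches a third of the nodes, and another tenant is fully affected only if its shard is the same four nodes.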

Recent articles

2026-06-09 — sources/2026-06-09-airbnb-scaling-beyond-one-data-architecture (Patrick Lam, Namrata Lamba, Jamie Stober on "Scaling beyond one: How Airbnb evolved its data architecture for a multi-product world" — framework for evolving a decade-old offline data warehouse from single-product (Homes) to three-product (Homes, Experiences, Services). Three foundational principles: no hybrid models, consistent identifier naming, namespace organization. Domain-driven modeling choice: product-facing domains chose separate models (listings, availability, location, guests), cross-cutting domains chose monolithic (messaging, payments, support). Data debt migration via dual-pipeline deprecation.)
2026-05-19 — sources/2026-05-19-airbnb-scaling-identity-graph-unified-knowledge-graph-infrastructure (Lucen Zhao, Shukun Yang, Ashish Jain on "Scaling Airbnb's identity graph with a unified knowledge graph infrastructure" — internal multi-tenant graph platform on JanusGraph + DynamoDB replacing third-party vendor. Identity graph at 7B nodes / 11B edges / 5M edges per day / 4–8 hop queries. Migration delivered 10× write QPS, significant P99 latency reduction, elimination of periodic reboots. Key optimizations: DynamoDB conditional-write transactions, parallel getMultiSlices, client-side Gremlin query rewriting, distributed tracing integration.)
2026-05-13 — sources/2026-05-13-airbnb-viaduct-1-0-and-the-future-of-airbnbs-data-mesh (Ryan Tanner, Raymie Stata, Adam Miskiewicz on "Viaduct 1.0 and the future of Airbnb's data mesh" — OSS 1.0 release of Viaduct, Airbnb's GraphQL-based data-oriented service mesh used internally for years. Architectural contribution: third topology for decentralized development of a central GraphQL schema alongside UBFF (one service, one module) and Apollo Federation (many services, one module each): Viaduct is few runtimes, many tenant modules per runtime. Load-bearing framing quote: "Federation distributes development by distributing servers. Viaduct distributes development by distributing modules." Tenant module contract is intentionally minimal: directory + SDL + resolvers; the platform handles execution / scaling / integration. Complementary to Federation, not an alternative: a Viaduct instance can participate as a subgraph in a federated supergraph, so a "large organization where hundreds of teams contribute to the overall graph" can run "a smaller number of Viaduct instances, each hosting many closely related tenant modules" and let Federation compose them — collapsing per-team server cost (factor of M for M modules per instance) while preserving cross-org composition. OSS 1.0 readiness substrate: @StableApi / @ExperimentalApi / @InternalApi annotations across all public surfaces + Kotlin binary compatibility validator in CI + Maven Central publication + Dokka-generated API docs + open community RFC process (first instance: the Connections RFC on GitHub). GraphQLConf 2026 talk teasers signpost forthcoming engineering-retrospective content on multi-tenant gateway observability with built-in ownership tags + cost-aware tracing (Vickey Yeh), gateway sharding for blast-radius reduction (Linquan Zhang & Cetin Sahin), probabilistic correctness testing on Viaduct (James Bellenger), and LLM-driven @generateMock data generation (Michael Rebello) — ingestion candidates when the recordings/write-ups appear.)
2026-05-05 — sources/2026-05-05-airbnb-monitoring-reliably-at-scale (Abdurrahman J. Allawala on "Monitoring reliably at scale" — Airbnb's Observability team breaks circular dependencies in its metrics platform along three axes: (1) dedicated-but-managed Kubernetes clusters for observability workloads (concepts/dedicated-but-managed-infrastructure + patterns/dedicated-observability-kubernetes-clusters) — "just right" middle option between shared-production (couples observability to its targets) and self-run K8s (too much ops burden on the small team); Cloud team still administers; coordinated-change discipline enforced; (2) custom Envoy L7 ingress tier for telemetry, independent of the shared Istio mesh (patterns/custom-l7-proxy-for-telemetry-over-service-mesh), with header-based tenant routing mapping ~1,000 services → cluster backends; motivated by "orders of magnitude more observability traffic than business traffic" + circular dependency of mesh-metrics on the mesh + two-way noisy-neighbour hazard with Airbnb.com traffic; adds an eighth Envoy role on the wiki (telemetry-ingress); extensibility hooks for metric mirroring + fine-grained ACLs; (3) meta-monitoring (concepts/meta-monitoring) — dedicated systems/prometheus + systems/alertmanager HA pairs pinned to nodes/AZs disjoint from the primary stack with pair-level anti-affinity; terminated by a dead-man's switch (patterns/heartbeat-absence-as-alert-trigger) — always-firing alert → SNS → CloudWatch rate alarm on an AWS control plane distinct from the K8s-hosted stack; design bar stated verbatim: "treat monitoring as a production system whose availability must exceed that of what it observes." Compute-vs-networking own-vs-adopt asymmetry articulated explicitly — Kubernetes adopted because the shared foundation fits; networking owned because telemetry's requirements (prioritisation / isolation / custom routing) diverge from what a business-traffic-shaped mesh can cleanly provide.)
2026-04-28 — sources/2026-04-28-airbnb-skipper-building-airbnbs-embedded-workflow-engine (Skipper: embedded Java/Kotlin workflow engine for durable execution; library-in-service shape rather than external orchestration cluster (concepts/embedded-workflow-engine); shares host service's DB for workflow state (MySQL / UDS / DynamoDB); 5-annotation programming model (patterns/workflow-primitives-as-annotated-classes); state-field replay not event history; near-zero happy-path overhead via delayed timeout task; determinism invariant + at-least-once action execution; @Compensate reverse-order walk-back elevates saga compensations to first-class primitive (concepts/workflow-compensation-action); signals via @SignalMethod + durable waitUntil { cond }; in production >1 year across 15+ teams — peak 10 000 workflows / second on DynamoDB; multi-hour Media Foundation video-processing jobs survive pod restarts; Infrastructure team uses it for durable Flink job lifecycle; explicit rejection of external clusters for Tier 0 services to avoid new critical dependencies)
2026-04-21 — sources/2026-04-21-airbnb-building-a-fault-tolerant-metrics-storage-system (storage-plane deep-dive: 50M samples/sec, 1.3B series, 2.5 PB; shuffle-sharding for per-tenant read/write isolation; tenant-per-application for ~1,000 services; single-cluster reliability → multi-cluster federation via custom Promxy; progressive cluster rollout for >99.9% availability; 5–10× federated-query cost tax; Grafana K8s rollout operators replacing multi-day manual deploys; clusters as cattle not pets)
2026-04-16 — sources/2026-04-16-airbnb-statsd-to-otel-metrics-pipeline (StatsD → OTLP migration via shared-library dual-write; two-tier vmagent streaming aggregation at 100M+ samples/sec; delta temporality for top emitters; zero-injection for sparse counters)
2026-04-14 — sources/2026-04-14-airbnb-privacy-first-connections (privacy-first identity model for social Experiences: User ID ↔ many context-scoped Profile IDs, Himeji authorization with write-time relation denormalization, AI-assisted audit+refactor migration)
2026-03-17 — sources/2026-03-17-airbnb-observability-ownership-migration (5-year vendor → in-house Prometheus/PromQL metrics migration; intent-preserving translation, metadata engine, alerts-as-code, own the interaction layer)
2026-03-12 — sources/2026-03-12-airbnb-destination-recommendation-transformer (transformer-based destination recommendation model; user actions as tokens with summed city + region + days-to-today embeddings; 14 training examples per booking = 7 active + 7 dormant to balance short-term and long-term intent; multi-task region + city heads to inject geolocation hierarchy; autosuggest + abandoned-search email applications, A/B wins in non-English-primary regions)
2026-03-04 — sources/2026-03-04-airbnb-alert-backtesting-change-reports (deep-dive on the Reliability XP alert-authoring platform: local-first dev + Change Reports + bulk alert backtesting hooking Prometheus's rules/manager.go; per-backtest K8s pod isolation; 300K alerts migrated, 90% alert-noise reduction, month → afternoon iteration cycle)
2026-02-18 — sources/2026-02-18-airbnb-sitar-dynamic-configuration (Sitar: dynamic config platform architecture)
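
A minimal Python sketch of the dead-man's switch from the 2026-05-05 entry above (patterns/heartbeat-absence-as-alert-trigger): an always-firing alert acts as a heartbeat, and an independent watcher raises an alarm when the heartbeats stop. In the source the watcher is an SNS topic feeding a CloudWatch rate alarm; this sketch only models the absence check in-process, and the class name, method names, and the 120-second threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeadMansSwitch:
    """Alarms when no heartbeat has arrived within `max_silence_s` seconds."""

    max_silence_s: float
    last_heartbeat_s: Optional[float] = None

    def heartbeat(self, now_s: float) -> None:
        # Called each time the always-firing alert is delivered.
        self.last_heartbeat_s = now_s

    def is_alarming(self, now_s: float) -> bool:
        # No heartbeat ever seen counts as silence: the monitored stack
        # has not proven it is alive.
        if self.last_heartbeat_s is None:
            return True
        return now_s - self.last_heartbeat_s > self.max_silence_s
```

The inversion is the point of the pattern: a healthy monitoring stack keeps sending a signal, so a total outage of that stack shows up as silence, and the watcher has to run somewhere that does not share its failure modes.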