System Design — Overview

Synthesis across the wiki corpus. Snapshot (2026-06-18T03:00): 536 sources / 40 companies / 1,684 systems / 2,861 concepts / 1,771 patterns. No new ingests since the 2026-06-13 synth (5 days). 9 skips on 2026-06-17 (all product announcements, marketing, or duplicate re-fetches). Wiki stable; corpus growth is paused while Data + AI Summit noise is filtered out.

What changed since 2026-06-13 synth

Quiet window. RSS poller fetched 9 articles on 2026-06-17; all were filtered out:

1× Cloudflare product announcement (Cloudflare One agent toolkit — no architecture)
5× Databricks Data + AI Summit marketing/product posts (dashboards, partner frameworks, security roundup, ML engineering agents, ecosystem pitch)
1× Google Earth AI research (geospatial ML, no serving-infra)
2× Netflix duplicate re-fetches (Human Infrastructure, State of Routing — both already ingested)

Last substantive batch (2026-06-11 → 2026-06-13) — 8 ingests across 6 companies:

Airbnb: Scaling beyond one data architecture (T2). Data modeling framework for multi-product evolution: foundational principles (no hybrids, consistent naming, clear namespaces) + decentralized domain choice. New: 6 concepts (separate-vs-monolithic-data-models, etc.), 5 patterns. 14 pages touched.
Atlassian: Architecting Scalable ML Platforms (T2). ML Studio: composable workflow modules, deterministic task caching, hot-cluster reuse, column-level access control. New: systems/atlassian-ml-studio, 7 concepts, 6 patterns. 16 pages touched.
Databricks: AI Serving Platform That Adapts to Your Model (T3, architectural). AutoPilot Pod Autoscaler: two-axis horizontal+vertical autoscaling, warm node pools, model-aware concurrency tuning. 300K+ QPS with no customer-tuning knobs. New: 2 systems, 4 concepts, 3 patterns. 14 pages touched.
Lyft: Metric Semantic Layer (T2). Metrics-as-code: YAML + Jinja → SQL generation, dual-owner governance, MCP integration for AI agents. Key insight: only "Golden Metrics" with ≥2 consumers qualify. New: systems/lyft-metric-semantic-layer, 4 concepts, 4 patterns. 14 pages touched.
Cloudflare: Scaling Security Insights (T1). 10× scanning throughput via five architecture-only fixes (no infra adds): batch-parallel Kafka consumption, fast/slow consumer split, hybrid bulk INSERT, active-passive API collocation, adaptive rate-limited scheduling. New: systems/cloudflare-security-insights, 3 concepts, 3 patterns. 15 pages touched.
Databricks: Zerobus Ingest — Petabyte-Scale (T3, deep). Zerobus architecture: stream-connection-level ordering (not partition-level), zero-copy protobuf at ~1 GB/s/core, WAL-before-lakehouse-publish. 12 GB/s sustained / 1 PB in 24h to single Delta table. New: 2 concepts, 4 patterns. 13 pages touched.
Dropbox: MCP + Dash for design-to-code security (T2). Only 12% of PRs link back to threat models. MCP-as-context-bridge achieves 80% linkage via semantic search. Advisory-over-blocking principle. 11 pages touched.
Databricks: Evolutionary DB Dev Part 3 (T3, conceptual). Team-scale database branching: tier topology as long-running branches, SCM state machine with blocking gates, agents-as-junior-developers, DBA→platform engineer. Neon reports ~500K branches/day, 80%+ agent-created. New: systems/lakebase-app-dev-kit, 3 concepts, 5 patterns. 16 pages touched.
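
The "deterministic task caching" item in the Atlassian ML Studio entry can be illustrated with a minimal sketch. It assumes a task's output depends only on its name, code version, and inputs, so a hash of those three is a safe cache key. The function names, the in-memory dict, and the `featurize` task are hypothetical and are not taken from the Atlassian post.

```python
import hashlib
import json
from typing import Any, Callable

# Hypothetical in-memory cache; a shared platform would use durable storage.
_cache: dict[str, Any] = {}


def cache_key(task_name: str, code_version: str, inputs: dict[str, Any]) -> str:
    """Derive a stable key from everything assumed to affect the task output."""
    payload = json.dumps(
        {"task": task_name, "version": code_version, "inputs": inputs},
        sort_keys=True,  # key order must not change the hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(task_name: str, code_version: str, inputs: dict[str, Any],
               fn: Callable[..., Any]) -> Any:
    """Run fn(**inputs) once per distinct key; later calls reuse the result."""
    key = cache_key(task_name, code_version, inputs)
    if key not in _cache:
        _cache[key] = fn(**inputs)
    return _cache[key]


calls: list[tuple[int, int]] = []


def featurize(rows: int, seed: int) -> str:
    calls.append((rows, seed))  # record real executions
    return f"features(rows={rows}, seed={seed})"


first = run_cached("featurize", "v1", {"rows": 100, "seed": 7}, featurize)
second = run_cached("featurize", "v1", {"seed": 7, "rows": 100}, featurize)  # cache hit
third = run_cached("featurize", "v2", {"rows": 100, "seed": 7}, featurize)   # new version, reruns
```

Only two real executions happen here: the second call reuses the first result because the inputs match regardless of key order, while bumping the code version invalidates the entry.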

Corpus shape

Metric	Count
Source pages	536
Company pages	40
System pages	1,684
Concept pages	2,861
Pattern pages	1,771
Total wiki pages	~6,892

By company (top 21, source count)

Company	Sources	Tier
companies/cloudflare	58	T1
companies/databricks	47	T3
companies/redpanda	38	T3
companies/meta	31	T1
companies/aws	31	T1
companies/netflix	30	T1
companies/flyio	30	T3
companies/planetscale	28	T3
companies/figma	23	T2
companies/zalando	20	T2
companies/google	19	T1
companies/pinterest	15	T2
companies/mongodb	14	T3
companies/slack	13	T2
companies/instacart	13	T2
companies/yelp	12	T3
companies/vercel	12	T3
companies/dropbox	12	T2
companies/airbnb	12	T2
companies/github	11	T2
companies/datadog	10	T2

Publication timeline

~61% of sources published in 2026 (325), ~26% in 2025 (140), ~11% in 2024 (57). Remainder are canonical older posts (Figma multiplayer 2019, Zalando SRE 2021–2023). Ingestion rate: 144 sources in April 2026 (backfill), settling to ~12–15/month steady-state. June 2026 trending quiet: 23 sources in first 13 days, then pause due to summit noise filtering.

Under-sampled: Uber (HTML scraper pending), LinkedIn (stub), Apple / ByteDance / Microsoft Engineering.

Recurring architectural themes (by citation density)

Tier A — pervasive (40+ source references or 200+ inbound links)

Blast radius containment — 18+ source refs, 283 inbound links. Cell architecture, staged rollouts, fault-domain isolation, incremental validation. Every Tier-1 company writes about it. Key: concepts/blast-radius, patterns/staged-rollout, patterns/incremental-blast-radius-validation.
Control-plane / data-plane separation — 22 source refs, 160 inbound links. Extended by "control plane as the new data plane" under agentic workloads. Key: concepts/control-plane-data-plane-separation, systems/vitess, systems/lakebase.
LLM-as-judge — 15 source refs, 200 inbound links. Dominant offline-eval pattern 2025–2026. Meta BVT, Zalando search quality, Instacart relevance, Cloudflare code review, Netflix synopses. Key: concepts/llm-as-judge.
MCP / agent-native infrastructure — 31+ source tags, 289 inbound links. Fastest-growing theme. This window adds Dropbox MCP-for-security, Lyft MCP-for-metrics-agents. Key: systems/model-context-protocol, patterns/wrap-cli-as-mcp-server, patterns/specialized-agent-decomposition, patterns/mcp-as-context-bridge.
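
As an illustration of how a staged rollout bounds blast radius, here is a minimal sketch. The stage names, fleet fractions, and the 1% error threshold are hypothetical; real plans in the cited sources are organised around cells, regions, or fault domains, and the health signal would come from live telemetry rather than a callback.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Stage:
    name: str
    fleet_fraction: float  # share of the fleet exposed once this stage is live


# Hypothetical plan: exposure grows only after the previous stage looks healthy.
STAGES = [
    Stage("canary", 0.01),
    Stage("one-cell", 0.10),
    Stage("half", 0.50),
    Stage("all", 1.00),
]


def staged_rollout(stages: list[Stage],
                   error_rate_at: Callable[[Stage], float],
                   max_error_rate: float = 0.01) -> list[str]:
    """Deploy stage by stage; roll back and stop at the first unhealthy stage."""
    log: list[str] = []
    for stage in stages:
        log.append(f"deploy:{stage.name}")
        if error_rate_at(stage) > max_error_rate:
            # Validation gate failed: damage is capped at this stage's fraction.
            log.append(f"rollback:{stage.name}")
            return log
    log.append("complete")
    return log


healthy = staged_rollout(STAGES, lambda stage: 0.0)
bad_at_cell = staged_rollout(
    STAGES, lambda stage: 0.05 if stage.name == "one-cell" else 0.0
)
```

In the failing run the rollout stops after "one-cell", so at most 10% of the hypothetical fleet ever sees the bad change.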

Tier B — structural (15–40 source refs)

Change data capture — 17 source tags, 199 inbound links. Redpanda Connect, Debezium, Kafka Connect, Delta CDF, Oracle CDC. The plumbing connecting OLTP → analytics → lakehouse. Key: concepts/change-data-capture, systems/debezium, systems/redpanda-connect.
Observability — 41 source tags, 192 inbound links. Shifting from push-to-TSDB to lakehouse-resident telemetry. OTel becoming universal. Key: concepts/observability, systems/opentelemetry, patterns/telemetry-to-lakehouse.
Compute-storage separation — 16 source refs, 125 inbound links. The defining storage architecture: Lakebase, Snowflake, Neon, PlanetScale, Redpanda Cloud Topics. Key: concepts/compute-storage-separation.
Horizontal sharding — appears in 162 sources (by content). PlanetScale/Vitess consensus series, DynamoDB, Netflix wide-partition splits. Key: concepts/horizontal-sharding, systems/vitess, concepts/wide-partition-problem.
Schema evolution / database branching — 13 source refs (growing). Vitess online DDL, PlanetScale deploy requests, Lakebase three-part series, Iceberg schema evolution. Database branching moving from dev convenience → production substrate. Key: concepts/schema-evolution, concepts/database-branching, concepts/evolutionary-database-design.
Autoscaling as system design — 12+ sources. Databricks two-axis autoscaler, Netflix container mount, Cloudflare adaptive rate-limited scheduling, Redpanda elastic partitioning. The theme: autoscaling is architectural, not operational. Key: concepts/cold-start, patterns/asymmetric-aggressive-up-conservative-down-autoscaling, patterns/two-axis-horizontal-plus-vertical-autoscaling.
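
The asymmetric aggressive-up / conservative-down pattern named above can be sketched as a single decision function. This is one plausible reading of the pattern name, not any vendor's algorithm: scale-up jumps straight to the computed target, scale-down removes at most a fixed number of replicas per tick. The 60% target, the step size, and the utilization readings are hypothetical.

```python
def next_replicas(current: int, utilization_pct: int, target_pct: int = 60,
                  max_step_down: int = 1, min_replicas: int = 1,
                  max_replicas: int = 100) -> int:
    """Return the replica count for the next tick."""
    # Replicas needed to bring utilization back to target (integer ceiling).
    desired = (current * utilization_pct + target_pct - 1) // target_pct
    if desired > current:
        proposed = desired  # aggressive up: jump to the target at once
    else:
        # Conservative down: shed at most max_step_down replicas per tick.
        proposed = max(desired, current - max_step_down)
    return max(min_replicas, min(max_replicas, proposed))


# Hypothetical readings: one spike followed by three quiet ticks.
replicas = 4
trace = []
for reading in [120, 30, 30, 30]:
    replicas = next_replicas(replicas, reading)
    trace.append(replicas)
# trace is [8, 7, 6, 5]: doubled in one tick, then drained one replica at a time
```

The asymmetry encodes a cost judgment: under-provisioning during a spike hurts users immediately, while holding a few extra replicas briefly only costs money.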

Tier C — emergent / fast-growing (5–15 source refs)

AI/LLM serving at scale — 26 LLM + 22 agents tags. Databricks 300K+ QPS inference (new this window), Slack multi-cloud routing, Netflix model-serving, Cloudflare Workers AI. Key: concepts/cold-start, concepts/context-engineering, patterns/multi-cloud-llm-serving, systems/databricks-model-serving.
Durable execution — 8 source refs, 149 inbound links on Cloudflare Durable Objects alone. Implementations: embedded (Temporal), external (Step Functions), workflow-as-code (Cloudflare), DB-backed (Maestro). Key: concepts/durable-execution, systems/cloudflare-durable-objects.
Post-quantum cryptography — 5 sources. Cloudflare (IPsec ML-KEM GA, TLS 1.3 PQ), Meta (migration framework), Google (quantum vulnerability disclosure). Key: concepts/post-quantum-cryptography.
Generative retrieval — 3 sources (Instacart TIGER, Meta SilverTorch, Instacart ads). Autoregressive token generation replacing two-tower + ANN scoring. Key: concepts/generative-retrieval, systems/silvertorch.
Defense-in-depth / zero-trust — 29 security tags. Cloudflare customer-zero, Yelp zero-trust access, GitHub eBPF deployment safety, Dropbox MCP-for-security (new). Key: concepts/defense-in-depth, concepts/positive-security-model.
Metrics-as-code / semantic layer — 5 sources (new theme this window). Lyft MSL, Databricks BI Serving, Airbnb data architecture, Pinterest PiQaMa. Governance shifting left into versioned config. Key: concepts/headless-bi-semantic-layer, patterns/yaml-config-driven-metric-definitions.
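
A minimal sketch of the metrics-as-code idea: a versioned metric definition is validated, then rendered to SQL from a template. To stay within the standard library, a plain dict stands in for the YAML file and `string.Template` stands in for Jinja. All field names, the sample metric, and the reading of "dual-owner" as two distinct owners are assumptions, not Lyft's schema; the two-consumer rule follows the "Golden Metrics" note in the Lyft entry.

```python
from string import Template

# Stand-in for a YAML metric definition; every field name is hypothetical.
METRIC = {
    "name": "completed_rides",
    "owners": ["data-eng", "marketplace-analytics"],
    "consumers": ["exec-dashboard", "pricing-agent"],
    "table": "rides",
    "expression": "COUNT(*)",
    "filter": "status = 'completed'",
    "dimensions": ["city", "ride_date"],
}

# Stand-in for a Jinja template.
SQL_TEMPLATE = Template(
    "SELECT $dims, $expression AS $name\n"
    "FROM $table\n"
    "WHERE $filter\n"
    "GROUP BY $dims"
)


def validate(metric: dict) -> list[str]:
    """Governance checks that would run in review or CI before SQL is generated."""
    errors = []
    if len(set(metric["owners"])) < 2:
        errors.append("needs two distinct owners")
    if len(set(metric["consumers"])) < 2:
        errors.append("needs at least two consumers")
    return errors


def render_sql(metric: dict) -> str:
    """Render the metric to SQL, refusing definitions that fail governance."""
    errors = validate(metric)
    if errors:
        raise ValueError("; ".join(errors))
    return SQL_TEMPLATE.substitute(
        dims=", ".join(metric["dimensions"]),
        expression=metric["expression"],
        name=metric["name"],
        table=metric["table"],
        filter=metric["filter"],
    )


sql = render_sql(METRIC)
```

Because the definition is data, the same file can feed dashboards, generated SQL, and an agent-facing catalog without each consumer re-deriving the metric.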

Most-cited systems (by inbound wiki-links, top 20)

System	Links	Primary source
systems/vitess	456	PlanetScale
systems/mysql	432	(ubiquitous)
systems/aws-s3	335	AWS
systems/model-context-protocol	289	Anthropic/ecosystem
systems/planetscale	274	PlanetScale
systems/kafka	265	(ubiquitous)
systems/redpanda	246	Redpanda
systems/postgresql	239	(ubiquitous)
systems/unity-catalog	235	Databricks
systems/cloudflare-workers	234	Cloudflare
systems/apache-iceberg	221	(ecosystem)
systems/kubernetes	211	(ubiquitous)
systems/lakebase	210	Databricks
systems/apache-spark	199	Databricks
systems/delta-lake	175	Databricks
systems/innodb	174	MySQL/Oracle
systems/dynamodb	164	AWS
systems/aws-lambda	161	AWS
systems/fly-machines	157	Fly.io
systems/cloudflare-durable-objects	149	Cloudflare

Most-cited concepts (top 20)

Concept	Source refs	Links
concepts/blast-radius	18	283
concepts/llm-as-judge	15	200
concepts/change-data-capture	14	199
concepts/observability	21	192
concepts/control-plane-data-plane-separation	22	160
concepts/horizontal-sharding	8	129
concepts/compute-storage-separation	16	125
concepts/defense-in-depth	10	28
concepts/scale-to-zero	11	28
concepts/tail-latency-at-scale	8	25
concepts/cold-start	11	22
concepts/durable-execution	8	20
concepts/context-engineering	12	19
concepts/post-quantum-cryptography	5	18
concepts/vector-similarity-search	11	17
concepts/database-branching	6	17
concepts/schema-evolution	10	16
concepts/tenant-isolation	9	16
concepts/medallion-architecture	7	16
concepts/evolutionary-database-design	4	15

Most-cited patterns (top 15)

Pattern	Links	Description
patterns/upstream-the-fix	46	Fix at source, not downstream
patterns/specialized-agent-decomposition	27	Decompose agent into specialist sub-agents
patterns/wrap-cli-as-mcp-server	22	Expose CLI tools via MCP
patterns/disposable-vm-for-agentic-loop	17	Sandbox agent in ephemeral VMs
patterns/ai-gateway-provider-abstraction	17	Unified gateway across LLM providers
patterns/tool-surface-minimization	17	Fewer tools = more reliable agents
patterns/staged-rollout	15	Progressive deploy with rollback gates
patterns/measurement-driven-micro-optimization	13	Profile → optimize → measure loop
patterns/cheap-approximator-with-expensive-fallback	13	Fast heuristic + slow precise backup
patterns/ltx-compaction	13	LTX file compaction strategy
patterns/partner-managed-service-as-native-binding	13	Third-party as first-class binding
patterns/streaming-broker-as-lakehouse-bronze-sink	11	Stream → lakehouse bronze layer
patterns/fast-rollback	11	Instant revert in deployments
patterns/dynamic-partition-split-async-pipeline	10	Netflix wide-partition auto-split
patterns/mcp-as-context-bridge	10	MCP for cross-system context retrieval

Trade-offs and contradictions

Documented tensions across sources

Monolith ↔ microservice. Meta SilverTorch collapses retrieval microservices → unified PyTorch model. Airbnb expands from monolithic vendor → distributed graph. Rule: compute-bound consolidates; IO-bound + multi-team distributes.
Generative retrieval vs two-tower + ANN. Instacart chose generative for large catalogs + cold-start. Meta/Pinterest retain two-tower for real-time personalization. Split on: latency tolerance × catalog size × cold-start severity.
Partition-level vs stream-level ordering. Kafka: partition = ordering unit (scaling requires rebalance). Zerobus: stream-connection = ordering unit (dynamic scaling). Trade-off: Kafka's design is simpler but inflexible; Zerobus enables elastic autoscaling but requires custom client semantics.
Database branching: copy-on-write vs schema-only. Lakebase uses CoW forks (instant, data-included). PlanetScale deploy requests branch schema only. Neon CoW with ephemeral compute. Different DX goals (testing vs migration safety vs full-environment parity).
Agent governance: centralized vs per-tool. Unity Catalog (centralized ACL) vs Cloudflare (per-tool Zod schema + egress policy) vs Lakebase (SCM state machine with blocking gates). Spectrum, not binary — all converging on "agents-as-junior-developers" with structural constraints.
Separate vs monolithic data models. Airbnb's explicit framework: product teams with unique attributes → separate; cross-cutting services → monolithic. Neither universally superior.
Signature-matching vs ML-scoring for security. Cloudflare: ML anomaly scoring over WAF signatures for frontier-model threats. Dropbox: LLM-based semantic gap detection between design and code. Both moving from pattern-match to reasoning-based security.
Active-active vs active-passive API. Cloudflare discovered active-active API with single-region DB primary causes cross-region latency → pool exhaustion. Active-passive (collocate API with primary) wins for write-heavy workloads.
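
The active-active failure mode in the last item (cross-region latency leading to pool exhaustion) follows from Little's law: connections in use ≈ request rate × time each request holds a connection. The sketch below assumes each request holds one connection while it issues its queries sequentially. All numbers, including the pool size of 50, are hypothetical and are not taken from the Cloudflare post.

```python
def connections_in_use(requests_per_sec: int, queries_per_request: int,
                       rtt_ms: int, db_time_ms: int) -> float:
    """Average busy connections by Little's law: L = arrival rate * hold time.

    Hold time per request is queries_per_request * (rtt_ms + db_time_ms),
    converted from milliseconds to seconds by the final division.
    """
    return requests_per_sec * queries_per_request * (rtt_ms + db_time_ms) / 1000


POOL_SIZE = 50  # hypothetical connection pool limit per API instance

# API collocated with the DB primary: ~1 ms round trip.
same_region = connections_in_use(200, 5, rtt_ms=1, db_time_ms=2)

# API in a remote region talking to the same primary: ~70 ms round trip.
cross_region = connections_in_use(200, 5, rtt_ms=70, db_time_ms=2)

pool_exhausted = cross_region > POOL_SIZE
```

With these inputs the collocated API needs about 3 connections and the remote one about 72, so the same traffic that is trivial next to the primary exhausts the pool from another region.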

Trends (confidence-weighted)

High confidence (5+ sources, clear trajectory)

MCP becoming de-facto agent-tool protocol. 31+ sources. Cloudflare, MongoDB, Redpanda, Databricks, Pinterest, Fly.io, Dropbox, Lyft all publishing MCP integrations. Use cases expanding: from CLI tooling → security context → metrics governance.
Observability data → lakehouse. Databricks OTel + Unity Catalog, Airbnb statsd→OTel, Yelp S3 access logs, Pinterest PerfView. Cost-driven: TSDB retention expensive; Iceberg/Delta cheap at petabyte scale.
AI code review / AI-in-CI standard. Cloudflare (coordinator + sub-reviewers), Meta (BVT), Zalando (AI-as-judge quality gates), Atlassian Rovo Dev, Dropbox MCP security review.
Security shifting to architecture-over-patching. Cloudflare customer-zero, positive-security-model, continuous red-team. Dropbox: advisory LLM-based gap detection. Assumption: AI-assisted attackers make signature-based defense insufficient.
Database branching as substrate, not feature. Lakebase 3-part series, Neon (~500K branches/day), PlanetScale deploy requests. Databases becoming "branchable by default" for agents and CI. Key: concepts/evolutionary-database-design, patterns/per-developer-database-branch-paired-with-code-branch.

Medium confidence (3–4 sources, emerging)

Generative retrieval → production. Instacart (TIGER + ads) + Meta (SilverTorch). Both shift from "score candidates" to "generate candidate IDs."
Bare-metal fleet ops as system design. Cloudflare boot-time optimization (hours→minutes) + Meta PowerLoss Storm + Redpanda Cloud Topics metastore. Firmware/UEFI/iPXE is load-bearing.
Contract-driven multi-agent coordination. OmniNode topic-naming, Atlassian Jira triggers, MongoDB MCP registry, Lakebase artifact-as-API. Inter-agent channel naming/schema as dominant failure mode.
Metrics-as-code with dual governance. Lyft MSL, Pinterest PiQaMa, Databricks BI Serving. Pattern: YAML config + template SQL + dual-owner approval + MCP exposure.
Zero-copy high-throughput ingestion. Zerobus (~1 GB/s/core via Rust zero-copy protobuf), Redpanda (zero-copy Kafka), Cloudflare scanning (batch-parallel Kafka). Memory allocation is the enemy of throughput.
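
The "memory allocation is the enemy of throughput" point can be shown in miniature with length-prefixed frames. This is only an illustration of the zero-copy idea in Python, not how Zerobus or Redpanda implement it (Zerobus is described above as Rust with zero-copy protobuf). The 4-byte big-endian length prefix is an assumed wire format, and truncated input is not handled.

```python
import struct
from typing import Union

Buffer = Union[bytes, bytearray]


def encode(payloads: list[bytes]) -> bytes:
    """Frame each payload as a 4-byte big-endian length followed by the bytes."""
    return b"".join(struct.pack(">I", len(p)) + p for p in payloads)


def frames_copy(buf: Buffer) -> list[bytes]:
    """Baseline: each payload is copied into a freshly allocated bytes object."""
    out, pos = [], 0
    while pos < len(buf):
        (n,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        out.append(bytes(buf[pos:pos + n]))  # allocates and copies n bytes
        pos += n
    return out


def frames_zero_copy(buf: Buffer) -> list[memoryview]:
    """Zero-copy: each payload is a view into the original receive buffer."""
    view = memoryview(buf)
    out, pos = [], 0
    while pos < len(view):
        (n,) = struct.unpack_from(">I", view, pos)
        pos += 4
        out.append(view[pos:pos + n])  # slicing a memoryview copies nothing
        pos += n
    return out


# Show that views share storage with the buffer while copies do not.
receive_buffer = bytearray(encode([b"alpha", b"be"]))
copies = frames_copy(receive_buffer)
views = frames_zero_copy(receive_buffer)
receive_buffer[4] = ord("A")  # mutate the first payload byte in place
```

After the in-place write, `bytes(views[0])` reflects the change and `copies[0]` does not, which is the trade-off in both directions: views avoid allocation but tie payload lifetime to the buffer.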

Low confidence (1–2 sources, watch)

Region-scale chaos engineering. Only Meta (PowerLoss Storm). Netflix operates at AZ level. Requires enormous maturity.
Post-quantum at scale. Only Cloudflare + Meta have published migration frameworks. Most companies haven't started.
Raft log ≡ LSM WAL unification. Redpanda Cloud Topics metastore. Elegant but only one implementation so far.
Stream-connection-level ordering replacing partition-level. Only Zerobus so far. Kafka's partition model still dominates.

Language / runtime observations¶

From source tags: Rust (15+ sources — Cloudflare Pingora, Meta WhatsApp, Aurora DSQL, Figma memory, Zerobus zero-copy), TypeScript (13 — Vercel, Cloudflare Workers, Zalando), Go (10 — Fly.io, Instacart serving, Datadog agent, Cloudflare scanning), Kotlin/Java (5 — Meta Kotlinator, Netflix JDK Vector API, Slack), Python (Lyft MSL, Atlassian ML Studio). Rust adoption concentrated at hot-path / memory-safety boundary. Go dominates network-intensive microservices. TypeScript dominates edge/serverless.

Open questions¶

Zerobus ordering guarantees under failure. Stream-connection-level ordering during failover/rebalance — docs mention graceful drain but not crash recovery semantics.
Lakebase branching production adoption. Neon reports ~500K branches/day; Lakebase production-scale numbers remain undisclosed.
Lyft MSL scale metrics. How many Golden Metrics? How many consumers? Cost of Python package approach vs service approach as org scales?
Cloudflare Security Insights cross-DC replication. Active-passive solved the problem; what happens when primary fails over?
Dropbox MCP security coverage. 80% linkage via semantic search — what's the false positive rate on gap detection?
Netflix dynamic partition split cross-DC. Whether split metadata replicates cross-DC or is region-local.
Meta PowerLoss Storm frequency. How often region-scale tests run is undisclosed.

Reading paths

Fundamentals → concepts/horizontal-sharding, concepts/compute-storage-separation, concepts/change-data-capture, concepts/eventual-consistency, concepts/blast-radius.
Storage deep-dive → systems/vitess, systems/lakebase, systems/apache-iceberg, systems/liquid-clustering, concepts/wide-partition-problem, patterns/expand-and-contract-schema-migration.
LLM serving → concepts/cold-start, concepts/scale-to-zero, patterns/multi-cloud-llm-serving, systems/databricks-model-serving, patterns/two-axis-horizontal-plus-vertical-autoscaling.
Agent infra → systems/model-context-protocol, systems/cloudflare-agents-sdk, patterns/specialized-agent-decomposition, patterns/wrap-cli-as-mcp-server, patterns/mcp-as-context-bridge, concepts/context-engineering.
Data architecture → concepts/separate-vs-monolithic-data-models, concepts/headless-bi-semantic-layer, patterns/domain-driven-data-modeling-choice, concepts/data-lakehouse.
Observability → concepts/observability, systems/opentelemetry, systems/prometheus, patterns/telemetry-to-lakehouse, concepts/observability-stack-partial-dependency.
Security / crypto → concepts/defense-in-depth, concepts/post-quantum-cryptography, concepts/positive-security-model, patterns/require-access-before-reachability, patterns/mcp-as-context-bridge.
Streaming / CDC → systems/redpanda, systems/kafka, concepts/change-data-capture, systems/zerobus-ingest, systems/redpanda-cloud-topics, patterns/streaming-broker-as-lakehouse-bronze-sink.
Retrieval / RecSys → concepts/generative-retrieval, concepts/two-tower-architecture, systems/silvertorch, systems/instacart-generative-ads-retrieval, concepts/semantic-id.
Graph at scale → concepts/knowledge-graph, systems/meta-tao, systems/janusgraph, systems/netflix-graph-abstraction, concepts/identity-graph.
Reliability → concepts/chaos-engineering, concepts/instantaneous-power-loss, patterns/staged-rollout, patterns/fast-rollback, concepts/bootstrapping-circular-dependency.
Database ops → concepts/database-branching, concepts/evolutionary-database-design, systems/lakebase, systems/planetscale, patterns/per-developer-database-branch-paired-with-code-branch, patterns/scm-workflow-state-machine.
Autoscaling → patterns/asymmetric-aggressive-up-conservative-down-autoscaling, patterns/two-axis-horizontal-plus-vertical-autoscaling, concepts/cold-start, systems/databricks-autopilot-pod-autoscaler.
Audit → wiki/index.md, wiki/log.md, wiki/analyses/.