A reading room for system design
How the hardest systems are actually built.
A curated archive of engineering writing from Netflix, Meta, AWS, Cloudflare, Stripe, and others — distilled into concepts, patterns, and systems you can actually cite.
Start here
Control Plane / Data Plane Separation
The single most-cited pattern in the corpus: keep the fast path independent from the thing that configures it.
Observability
Logs, metrics, traces — and the unresolved debate about what belongs in each.
Write-Ahead Logging
How databases, message queues, and filesystems all survive crashes the same way.
Eventual Consistency
What you actually get when you scale a read-heavy system past one region.
Backpressure
Unbounded queues are the most common production bug. This is how systems push back.
Tail Latency at Scale
Why the p99.9 matters more than the mean, and the hedged-request playbook for taming it.
Blast Radius
The discipline of designing so a single failure doesn't take everything with it.
Latest additions
- 2024-02-15 Fly.io — Globally Distributed Object Storage with Tigris [FLYIO]
- 2025-01-17 Scaling Large Language Models for e-Commerce: The Development of a Llama-Based Customized LLM
- 2026-04-17 Shared Dictionaries: compression that keeps up with the agentic web
- 2026-04-21 Figma — Server-side sandboxing — Containers and seccomp [FIGMA]
- 2026-04-17 Cloudflare — Agents that remember: introducing Agent Memory
- 2026-04-21 Cloudflare — Moving past bots vs. humans [CLOUDFLARE]
- 2026-02-11 Google Research — Scheduling in a changing world: Maximizing throughput with time-varying capacity [GOOGLE]
- 2025-10-30 Toward provably private insights into AI use (Google Research, 2025-10-30)
- 2024-08-15 We're Cutting L40S Prices In Half [FLYIO]
- 2025-02-14 We Were Wrong About GPUs [FLYIO]
- 2026-04-16 Building the foundation for running extra-large language models
- 2024-03-07 Fly.io — Fly Kubernetes does more now (FKS beta) [FLYIO]
Browse
Concepts — 568
The primitives: CAP, WAL, consistent hashing, backpressure, leader election.
Patterns — 405
Repeatable solutions: circuit breaker, sidecar, saga, event sourcing, outbox.
Systems — 523
Named systems in production: Dynamo, Kafka, Spanner, Borg, Colossus.
Sources — 178
Every ingested article with full metadata, tags, and backlinks.
How this works
An ingestion pipeline pulls engineering blog posts from ~30 company feeds, de-duplicates them, and runs each through a curator agent that distills the piece into structured notes and cross-links it against the existing concept graph. See the overview for the full methodology and corpus stats.
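The de-duplication step described above could look roughly like the sketch below. It is an illustration only, not the archive's actual pipeline: the `Post` type, the `canonical_url` and `content_key` helpers, and the choice to treat two posts as duplicates when either their canonical URL or their normalized text hash matches are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass
from urllib.parse import urlsplit, urlunsplit


@dataclass(frozen=True)
class Post:
    """Hypothetical shape of one feed entry; not the archive's real schema."""
    url: str
    title: str
    body: str


def canonical_url(url: str) -> str:
    """Lowercase scheme and host; drop query string, fragment, and trailing slash.

    Assumption: query strings carry only tracking parameters. Feeds that
    identify posts by query parameter would need a gentler rule.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def content_key(post: Post) -> str:
    """SHA-256 of the lowercased, whitespace-normalized title and body."""
    text = " ".join((post.title + " " + post.body).lower().split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def deduplicate(posts):
    """Keep the first post seen for each canonical URL and each content hash."""
    seen_urls = set()
    seen_content = set()
    unique = []
    for post in posts:
        url_key = canonical_url(post.url)
        body_key = content_key(post)
        if url_key in seen_urls or body_key in seen_content:
            continue  # same address or same text as an earlier post
        seen_urls.add(url_key)
        seen_content.add(body_key)
        unique.append(post)
    return unique
```

Matching on exact hashes only catches verbatim reposts and tracking-parameter variants. Detecting lightly edited copies would need a fuzzier comparison, which this sketch leaves out.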
The LLM-readable catalog mirrors every page for agent consumption.