Concepts

Distributed-systems concepts: consistency models, CAP, consensus, backpressure, CRDTs, etc.

568 pages

Most-cited

Control plane / data plane separation 16 sources — Architectural split between the "decide" path (control plane: validation,
Observability 13 sources — The function of providing visibility into application performance and
Vector Embedding 9 sources — A vector embedding is a dense numerical representation of a piece
Vector Similarity Search 8 sources — Vector similarity search is the retrieval primitive behind
Compute–storage separation 7 sources — Compute–storage separation is the architectural property where a
Lightweight formal verification 7 sources — Lightweight formal verification is a family of techniques that sit between ad-hoc testing and heavyweight proof-based formal methods (TLA+, …
LLM as Judge 7 sources — LLM-as-judge is the evaluation pattern in which one LLM scores
Agent context window 6 sources — The agent context window is the fixed-size LLM working set into
Blast radius 6 sources — Blast radius is the scope of damage that a single fault — bug,
Defense in Depth 6 sources — Defense in depth is the security posture of stacking
Hot path 6 sources — The hot path is the code that runs on every (or
Shared Responsibility Model 6 sources — The Shared Responsibility Model is AWS's contract-level framing

All pages (A–Z)

3D parallelism — 3D parallelism is the standard shape for training billion-plus-parameter language models at multi-node GPU-cluster scale: compose…
Abstract Syntax Tree — An Abstract Syntax Tree (AST) is an intermediate representation
Account-per-tenant isolation — Account-per-tenant isolation is the shape of tenant isolation
Active multi-cluster blast radius — Active multi-cluster blast radius is the reliability property
Actor model — The actor model is a concurrency + distributed-systems
Address-based agent routing — Address-based agent routing is the pattern of encoding the target
Agent context window — The agent context window is the fixed-size LLM working set into
Agent-ergonomic CLI — An agent-ergonomic CLI is a command-line interface
Agent-first storage primitive — Agent-first storage primitive names the design posture of building
Agent memory — Agent memory is an AI agent's accumulated, searchable context across turns and sessions — the things this agent (or this agent for this user…
Agent Readiness Score — Agent Readiness Score is a Lighthouse-style rubric proposed by
Agent Skills discovery — Agent Skills discovery is the pattern of publishing a
Agent-training-crawler redirect — An agent-training-crawler redirect is the practice of
Agentic Data Access — Agentic data access is the design concern that as applications
Agentic development loop — The agentic development loop is a closed-loop LLM code
Agentic development — Agentic development is a development model where an AI agent
Agentic Paywall — Agentic paywall is the design posture where content access is
Agentic traffic share — Agentic traffic share is the fraction of HTTP traffic
Agentic troubleshooting loop — An agentic troubleshooting loop is an investigation pattern in
Aggregate demand smoothing (multi-tenant scale) — Aggregate demand smoothing names the observation that, in a system shared by a very large number of independent tenants, the sum of bursty p…
Aggregation pipeline — Aggregation pipeline is MongoDB's
AI re-review (incremental) — Incremental AI re-review is the discipline of re-running an AI code-review agent with full awareness of its own prior findings and the human…
AI thinking heartbeat — AI thinking heartbeat is the operational discipline of emitting a periodic "the model is still working, here is how long it's been since any…
Aleatoric uncertainty — Aleatoric uncertainty is the component of prediction
Alert fatigue — The operator-side failure mode in which a notification channel emits
Always-on ambient sensing — Always-on ambient sensing is the serving-infra envelope where
Anonymous credential — An anonymous credential is a cryptographic primitive that lets
Anycast — Anycast is a network-layer routing discipline in which the
API Catalog — API Catalog is an IETF standard
Appropriate Technology — Appropriate technology is Werner Vogels' framing for choosing the
Architectural Isolation — Architectural isolation is the multi-tenant security posture
Async clone + background hydration — Async clone + background hydration is the
Async iteration — Async iteration is the JavaScript language protocol for
Asynchronous replication — Asynchronous replication is a replication posture in which the
Asynchronous reply email — Asynchronous reply email is the property of the email channel that
At-most-once delivery — At-most-once delivery is a messaging / RPC delivery semantic
Attribute-based access control (ABAC) — Attribute-based access control (ABAC) decides whether a principal
Authorization decision caching — Authorization decision caching stores the Allow / Deny outcome of
Autoformalization — Autoformalization is the problem of translating natural-language specifications (policies, procedures, guidelines, manuals, informal rules) …
Automated reasoning — Automated reasoning is the family of techniques that mechanically prove (or disprove) properties of systems against a formal specification —…
Automatic dependency tracking — Automatic dependency tracking is the mechanism where a reactive
Availability Zone balance — Availability Zone (AZ) balance is the property of a workload
AWS partition — An AWS partition is a logically isolated group of AWS Regions with
Back-dirty (feedback-loop invalidation) — A back-dirty is a feedback-loop pattern in a multi-system reactive
Backpressure — Backpressure is the control-plane primitive by which a
Backward compatibility — Backward compatibility is the property that interfaces accept
Benchmark methodology bias — A benchmark methodology bias is a confounder built into a
BF16 exponent redundancy — BF16 exponent redundancy is the empirical observation that
BGP route withdrawal — BGP route withdrawal is the act of unadvertising a prefix from
Big-O over key length — Standard algorithmic analysis talks about Big-O over the
Bin-packing — Bin-packing is the combinatorial-optimisation problem of
Binary authorization — Binary authorization is the security control that restricts a
Binary-size bloat — Binary-size bloat is the monotonic growth of a compiled
Bisimulation — Bisimulation is the two-way equality of behaviour-sets between two
Bitpacking — Bitpacking is the practice of combining multiple sub-byte
Blast radius — Blast radius is the scope of damage that a single fault — bug,
Block-level async clone — Block-level async clone is the storage-migration primitive
Bolt-vs-SQLite storage choice — The design decision between a key-value embedded store
Boolean query DSL — A boolean query DSL is a structured query-document shape in
Bootstrap percentile method — The bootstrap percentile method is a non-parametric technique for
Bot-vs-human frame — The bot-vs-human frame is the assumption that the important
Boundary as Feature — A design principle crystallised in the 2026 S3 Files post: when two
Break-glass escape hatch — Break-glass escape hatch is an explicit, telemetry-tracked mechanism for a human to override an automated gate when the automation is wrong,…
BSON document overhead — BSON document overhead is the per-document + per-field bytes that
Bucket Pattern — Bucket Pattern — MongoDB's named schema-design pattern…
Build graph — A build graph is the DAG of declared build/test actions and their
Bump-in-the-wire middlebox — A bump-in-the-wire middlebox is a network device — firewall,
Bundler chunk invalidation — Bundler chunk invalidation is the phenomenon where a
Bus-hop storage tradeoff — The bus-hop storage tradeoff is the architectural bargain a
BYOB reads (bring-your-own-buffer) — BYOB (bring-your-own-buffer) reads are the
BYOK (Bring Your Own Key) — BYOK (Bring-Your-Own-Key) is the posture in which a customer stores
C++ compilation model (pre-processing & transitive includes) — C++ compilation happens per translation unit (TU) — normally one .cpp
Cache locality — Cache locality is the property that requests for the same key
Cache TTL staleness dilemma — The cache TTL staleness dilemma is the forced either/or that
Cache-variant explosion — Cache-variant explosion is the
Canonical tag — The canonical tag is an HTML element — <link
XML canonicalisation — XML canonicalisation (canonical XML; C14N) is the process of
Capability-based sandbox — A capability-based sandbox is a code-execution environment
Cascades (LLM inference) — Cascades (in the LLM-inference-serving sense) is a
Catastrophic forgetting — Catastrophic forgetting (or "catastrophic interference") is the failure mode in which a neural network, while being trained on a new task or…
Central-first sharded architecture — Central-first sharded architecture pairs a global coordinator with
Centralized AI governance — Centralized AI governance is the pattern-concept of routing
Centralized network inspection — Centralized network inspection is the architectural shape
Change Data Capture (CDC) — Change Data Capture (CDC) is the practice of materialising a
Circular dependency (deployment context) — A circular dependency in the deployment context is the
Clean-room recovery account — A clean-room recovery account is a separate AWS account (or
CLI convention enforcement — CLI convention enforcement means baking naming and flag
Client-server model — The client-server model is the foundational deployment pattern
Client-side load balancing — Client-side load balancing means the caller chooses which backend instance to send a request to, rather than delegating that decision to a p…
Cluster health check — A cluster health check is a liveness / readiness probe that a
Customer-Managed Keys (CMK) — Customer-Managed Keys (CMK) is the encryption-key ownership model
Coding-agent sprawl — Coding-agent sprawl is the named problem class of an
Cold Start — Cold start is the extra latency a serverless / scale-to-zero service
Columnar storage format — A columnar storage format lays out a tabular dataset on disk by
Commit Sequence Number (CSN) — A Commit Sequence Number (CSN) is a monotonically-increasing integer stamped onto a transaction at commit time, used as the canonical positi…
Commit signing — Commit signing is the cryptographic attachment of a signature to a
Common-deadline scheduling — Common-deadline scheduling is the problem-class
Competitive ratio — Competitive ratio is the canonical quality metric for an
Compositional specification — A compositional (or modular) specification describes a system
Compression side-channel attack — A compression side-channel attack exploits the fact that
Compute shader — A compute shader is a GPU program that performs general
Compute–storage separation — Compute–storage separation is the architectural property where a
Computed Pattern — Computed Pattern — MongoDB's named schema-design pattern
Conflict-free Replicated Data Type (CRDT) — A Conflict-free Replicated Data Type (CRDT) is a data structure
Conformance checking — Conformance checking is the problem of mechanically verifying that a
Congestion window — The congestion window (often cwnd) is the transport-
Connection multiplexing — Connection multiplexing is the architectural pattern of
Connection time — Connection time is the wall-clock time for an end-user
Container ephemerality — Container ephemerality is the operational-semantics property that a
Container escape — A container escape is when code inside a container breaks
Content-addressed caching (of build/test actions) — Content-addressed caching stores action outputs keyed by a hash of the
Content-addressed ID — A content-addressed ID is an identifier derived deterministically from the content of a record via a cryptographic hash, so that two records…
Content Signals — Content Signals is a proposed extension to
Context engineering — Context engineering is the discipline of structuring, filtering,
Context rot — Context rot is the empirically observed degradation of LLM agent
Continued pretraining — Continued pretraining (also called "continual pretraining" or "domain-adaptive pretraining") is the technique of taking an already-pretraine…
Continuous reprediction — Continuous reprediction is the online policy of re-running
Contract-first design — Contract-first design = defining the API contract (endpoints,
Control plane / data plane separation — Architectural split between the "decide" path (control plane: validation,
Conversion-rate uplift — Conversion-rate uplift (CVR uplift) is the standard e-commerce /
Coordinated disclosure — Coordinated disclosure (historically responsible disclosure) is
Copy-on-write merge (compaction) — Copy-on-write merge is the compaction strategy that collapses a
CQRS (Command-Query Responsibility Segregation) — CQRS — Command-Query Responsibility Segregation — is the idea that
Crash-consistent replication — Crash-consistent replication produces recovery points whose state
Critical path (build / pipeline / distributed DAG) — In a DAG of dependent actions (build targets, pipeline steps, tasks), the
Cross-account backup — Cross-account backup is backup / replication written to a
Cross-Cloud Architecture — Cross-cloud architecture is the (deliberate) choice to run
Cross Cluster Replication (CCR) — Cross Cluster Replication (CCR) is Elasticsearch's primitive
Cross-Encoder Reranking — Cross-encoder reranking is a two-stage retrieval pattern where an initial fast retrieval stage returns a candidate set of ~10–100 documents,…
Cross-partition authentication — The design problem of authenticating a workload across two or more
Cross-Region backup — Cross-Region backup is backup / replication written to a
Cross-signed certificate trust (double-signed certificates) — A PKI design where two (or more) isolated root certificate
Cryptographic Shredding — Cryptographic shredding is the technique of rendering data
Cryptographically-relevant quantum computer (CRQC) — A cryptographically-relevant quantum computer is a quantum
Customer-driven metrics — Customer-driven metrics are workload measurements that are
Data classification tagging — Data classification tagging is attaching machine-readable
Data file analysis — Data file analysis is the agent primitive of scanning every file in
Data Mesh — Data mesh is an architectural approach to organisation-wide data
Data parallelism — Data parallelism (DP) is the simplest axis of distributed training: replicate the model across N workers, shard the mini-batch into N pieces…
Data–policy separation — Data–policy separation is the architectural discipline of keeping
Dead-code elimination — Dead-code elimination (DCE) is a linker / compiler pass that
Defense in Depth — Defense in depth is the security posture of stacking
HTTP delta compression (dcb / dcz) — HTTP delta compression is the wire-format layer that carries
Demand-driven replication — Demand-driven replication is the policy of materialising a
Deploy frequency vs caching — Deploy frequency vs caching is the structural tension
Derived subtree — A derived subtree is a subtree of a document tree whose structure
Design Away Invalid States — Design away invalid states is the architectural principle that
Detection-in-depth — Detection-in-depth is the security posture of layering
Deterministic Simulation — Deterministic simulation is a testing discipline where the entire
Device trust — Device trust is the security posture where a specific piece of
Diff noise filtering — Diff noise filtering is the preprocessing step that strips non-reviewable or auto-generated content from a code diff before any AI reviewer …
Digital sovereignty — Digital sovereignty is "managing digital dependencies — deciding
Digital-twin backtesting — Digital-twin backtesting is the technique of running
Disaster recovery tiers (backup / pilot light / warm standby / active-active) — The canonical AWS-lineage disaster-recovery ladder: four tiers
Disk throughput bottleneck — Disk throughput bottleneck — the regime where a system's
Document storage compression — Document storage compression is the per-block compression
DOM node count — DOM node count is the total number of elements the browser has
Downgrade attack — A downgrade attack is an active-adversary manipulation of a
DR configuration translation — DR configuration translation is the problem — and the mechanisms
Drafter-expert split — Drafter-expert split is the architectural primitive under
Durable execution — Durable execution is the property of a long-running
Dynamic backend fallback — Dynamic backend fallback is a resilience pattern where a system
Dynamic schema — field names as data — A document-database schema where field names are not pre-defined
Dynamic sharding — Dynamic sharding treats a service's shard assignment as state owned by a controller, continuously and asynchronously updated in response to …
Early-exit logits — Early-exit logits are the vocabulary-space logit vectors
Earthquake Early Warning (EEW) — Earthquake Early Warning (EEW) is the class of real-time
eBPF verifier — The eBPF verifier is the static-analysis pass the Linux kernel
Edge filtering — Edge filtering is the pipeline design move of dropping / matching
Edge-to-origin database latency — Edge-to-origin database latency is the extra per-query
Egress Cost — Egress cost is the per-byte charge a cloud provider levies when
Egress SNI Filtering — Egress SNI filtering is the pattern of enforcing an outbound
Elasticity — Elasticity is the property that a service's capacity and performance
Elicitation gate — An elicitation gate is the agent-architecture mechanism by
ELT vs ETL — ETL (Extract → Transform → Load) transforms data before loading
Email as agent interface — Email as agent interface is the framing that treats an email inbox
Embedded routing in IP address — Embedded routing in IP address is the addressing design where
Embedding Collection — An embedding collection is the organizational unit of a vector
Empty host — Empty host is the operational primitive of a fully-empty
Envelope Encryption — Envelope encryption is a multi-level key-hierarchy scheme for
Ephemeral credentials — Ephemeral credentials are credentials — keys, tokens,
Erasure coding — Erasure coding is a redundancy scheme that encodes data into more pieces than are needed to read it, so that the data survives the loss of a…
Error propagation — Error propagation is the deterministic downstream consequence of
Estimated Time of Arrival (ETA) — Estimated Time of Arrival (ETA) is the predicted clock time at
Evaluation label — An evaluation label is the unit of offline agent evaluation. It
Event-driven architecture — Event-driven architecture (EDA) is a software-system style in which
Eventual consistency — Eventual consistency is a liveness guarantee: if no new updates are made to a shared value, all observers will eventually converge on the sa…
Execution ladder — The execution ladder is a tiered-environment framing for agent
Expert parallelism (MoE serving) — Expert parallelism is a multi-GPU model-sharding strategy specific to Mixture-of-Experts (MoE) models: different experts (sub-networks activ…
Explicit graphics state — Explicit graphics state is an API-design choice where every
Factuality decoding — Factuality decoding is a category of decode-time
Fast feedback loops — Fast feedback loops = the architectural property that each
Feature freshness — Feature freshness is the service-level property describing
Feature store — A feature store is the class of ML-infrastructure systems that
Federated vs indexed retrieval — Federated retrieval and indexed retrieval are the two
Feedback-control load balancing — Feedback-control load balancing names the class of LB strategies that close a control loop around each backend's observed load: the controll…
File vs. Object Semantics — File semantics (the OS filesystem contract applications have been
Fine-grained authorization — Fine-grained authorization means deciding access at the level of
Fine-Grained Billing — Fine-grained billing is the practice of charging customers at a very
Fingerprinting vector — A fingerprinting vector is any signal — passive or active —
FIPS cryptographic boundary — FIPS cryptographic boundary is the deployment-topology surface
First-principles theoretical limit — First-principles theoretical-limit reasoning asks: given the physics of
Fleet drain operation — Drain is the fleet-operations primitive of relocating every
Fleet patching — Fleet patching is the operational capability of a managed-service
Forward declaration (C++) — A forward declaration in C++ names an identifier (class / struct /
Fragment pruning — Fragment pruning is the design pattern of pushing per-file (per-
Fragmented hardware/software ecosystem — The fragmented hardware/software ecosystem problem is the
Free-space optical communication — Free-space optical communication (FSO) is data transmission through
FUD attack surface — A FUD attack surface is the class of attack where an adversary
Fused decompression + matmul — Fused decompression + matmul is the GPU-kernel pattern in
Game engine architecture — Game engine architecture is the composition discipline in which an
Garbage collection (storage) — In immutable / append-only
Geographic sharding — Geographic sharding partitions data by a location dimension —
GIL contention (Python's Global Interpreter Lock) — Python's Global Interpreter Lock (GIL) is the mutex in the
Git delta compression — Git delta compression is how Git makes its
Git pack file — A Git pack file is Git's on-disk compressed
GitOps — GitOps is the operational discipline of treating a Git
Go build tags — Go build tags are file-level compile guards of the form
Go plugin dynamic-linking implication — Importing the stdlib plugin
Go runtime memory model (virtual vs resident) — In Go (and in most managed-memory runtimes) the virtual memory the runtime has reserved from the OS and the resident set size (RSS) — the ph…
Graphics API abstraction layer — A graphics API abstraction layer is the interface a rendering
Grep loop — The grep loop is an agent failure mode where a documentation
Grey failure — Grey failure names a component that is not fully broken but not
Hard-drive physics (capacity vs. seek-time) — Hard drives are mechanical devices (spinning platters + moving arm + flying head). This constrains them in a way that grows more severe ever…
Hardcoded literal address antipattern — The hardcoded literal address antipattern is the practice of
Hardware offload — Hardware offload is the design pattern of moving work that previously ran on general-purpose CPUs under a general-purpose OS/hypervisor onto…
Hardware/software co-design — Hardware/software co-design is the practice of shaping hardware
Harvest-now, decrypt-later (HNDL) — Harvest-now, decrypt-later (also store-now-decrypt-later
HBM vs. SMEM (GPU memory hierarchy) — Modern NVIDIA GPUs have two relevant memory tiers for inference
Head-of-Line Buffering (in streaming pipelines) — Intermediate layers — reverse proxies, compression middleware, CDN
Heat management (storage) — Heat in a multi-tenant storage system is the number of requests hitting a given drive per unit time. Heat management is the ongoing placemen…
Hermetic build — A hermetic build (or test) declares every input it depends on — source
Heterogeneous data formats — Heterogeneous data formats are the real-world data mix agents must
Heterogeneous fleet config skew — Heterogeneous fleet config skew is the failure mode where
Hexagonal architecture — Hexagonal architecture (Alistair Cockburn, 2005), also known as
Hidden agent directive — A hidden agent directive is a short textual instruction
Horizontal sharding — Horizontal sharding splits a single logical table (or group of related tables) so that its rows live across multiple physical database insta…
Hot key — A hot key is a single key whose request rate is disproportionately higher than the rest of the keyspace. Under any scheme that maps one key …
Hot path — The hot path is the code that runs on every (or
HOV Lane (High-Occupancy Vehicle Lane) — An HOV lane (high-occupancy vehicle lane; colloquially
HTTP/3 — HTTP/3 is the HTTP-over-QUIC version of HTTP —
HTTP 402 Payment Required — HTTP 402 Payment Required is an HTTP client-error status code
HTTP Message Signatures (RFC 9421) — HTTP Message Signatures (published as
Hub-and-Spoke Governance — Hub-and-spoke governance is the posture where a single central
Huffman coding — Huffman coding (Wikipedia)
Hybrid clean/noisy training corpus — A hybrid clean/noisy training corpus is a pre-training dataset
Hybrid key encapsulation — Hybrid key encapsulation is a key-agreement construction where
Hybrid retrieval (BM25 + dense vectors) — Hybrid retrieval is the pattern of combining a lexical index
Hybrid Search (vector similarity + metadata filter) — Hybrid search (in the vector-DB sense) is the retrieval primitive
Hybrid Vector Tiering (Cold S3 ↔ Hot OpenSearch) — Hybrid vector tiering is the storage-and-query pattern that
HyDE (Hypothetical Document Embedding) — HyDE (Hypothetical Document Embedding) is a retrieval technique where, instead of embedding the user's question and searching for documents …
Identity decoupling (user ID vs profile ID) — Separating the internal user entity (the complete record: legal name,
Identity-hiding handshake — An identity-hiding handshake is a cryptographic handshake
Identity vs behavior proof — Identity proof and behavior proof are two distinct answers
Immutable Object Storage — Immutable object storage is a model in which the stored unit — the
In-kernel filtering — In-kernel filtering is the architectural move of evaluating
Incremental delivery (over big-bang rewrites) — Incremental delivery is the posture of shipping architectural change as a sequence of small, observable, reversible steps — each of which de…
Inference compute–storage–network locality — For transaction-shaped inference workloads, the load-bearing
Inference vs training workload shape — Inference and training are fundamentally different workload
Instance-type fallback (prioritised node affinity) — Instance-type fallback is the scheduling pattern where a
Integer overflow — Integer overflow is the condition in which the result of an
Interaction to Next Paint (INP) — Interaction to Next Paint (INP) is a Core Web Vital
Interdependent systems — Interdependent systems are architectures in which a change in one
Interleaving testing — Interleaving testing is an online-evaluation technique for ranking
Intermediate representation bottleneck — A multi-stage pipeline that stages its data through a fixed
Intertoken latency (ITL / time-per-token) — Intertoken latency (ITL), also called time-per-output-token (TPOT), is the LLM-serving latency metric measuring the delay between successive…
Invalidation-based cache — An invalidation-based cache keeps cached values until notified
IP address fragmentation — IP address fragmentation is the phenomenon where a cluster's
IPv6 service mesh — IPv6 service mesh is the architectural stance of giving every
Iterative plan refinement — Iterative plan refinement is the agent-loop discipline where a
JavaScript heap size — JavaScript heap size is the memory the browser's JS engine (V8
JIT peer provisioning — Just-in-time (JIT) peer provisioning is the architectural
JSON-serializable DSL — A JSON-serializable DSL is a domain-specific language whose
JSONL output streaming — JSONL (JSON Lines) is a text format in which every line is a valid, self-contained JSON object. As a streaming protocol for long-running pro…
Kernel attack surface — Kernel attack surface is the set of kernel codepaths a
Kernel panic from scale — A kernel panic from scale is the production failure mode
Kernel state capacity limit — Kernel state capacity limit is the observation that kernel
kill / copy / boot migration tradeoff — The kill / copy / boot migration tradeoff is the classical
Killswitch subsystem — A killswitch subsystem is a mechanism inside a rules engine,
Knowledge Distillation — Knowledge distillation is the technique of transferring
Knowledge graph — A knowledge graph is a data structure that captures
Kubernetes label length limit (63 characters) — Kubernetes metadata label values are limited to 63 characters.
KV cache (transformer inference) — The KV cache is the per-layer, per-token Key and Value
Language extension vs replacement — Language extension vs replacement is the design question that shows
Latent misconfiguration — Latent misconfiguration is a configuration bug that is
Layer 7 load balancing — Layer 7 load balancing makes routing decisions at the application protocol layer (HTTP headers, gRPC method, request path, query parameters)…
Lazy Hydration — Lazy hydration is an initialization pattern: on first exposure of
Learned lifetime distribution — Learned lifetime distribution is the prediction unit of
Least-privileged access — Each actor — user, service, operator — sees only the data and gets only
Lift metric — In an interleaving test of ranking A vs
Lightweight formal verification — Lightweight formal verification is a family of techniques that sit between ad-hoc testing and heavyweight proof-based formal methods (TLA+, …
Link response header (RFC 8288) — Link: is an HTTP response header standardised as
Linux cGroup — A Linux cGroup (control group) is a kernel primitive that
Linux namespaces — Linux namespaces are the kernel primitive that isolates
Live-off-the-land — Live-off-the-land is the architectural posture of accomplishing
LLM as Judge — LLM-as-judge is the evaluation pattern in which one LLM scores
LLM decoding step — The decoding step is the final phase of LLM text generation
LLM hallucination — LLM hallucination is the failure mode where a language model
llms.txt — llms.txt is a plain-text file at the root of a website (e.g.
Local emulation — Local emulation = running a faithful-enough substitute for a
Local-first architecture — Local-first architecture is a system-design posture in which
Local-remote parity — Local-remote parity is the design goal of making local
Locality-aware scheduling — Locality-aware scheduling is the scheduler policy of placing a
Logical replication — Logical replication is a replication mode in which the primary
Logical vs physical sharding — Logical sharding and physical sharding are two different states in the horizontal-sharding rollout. Decoupling them — doing logical before p…
Logits (LLM / transformer) — Logits are the pre-softmax prediction scores a transformer
Logless Reconfiguration — Logless reconfiguration is MongoDB's replica-set
Long Fork anomaly — The Long Fork anomaly is a violation of Snapshot Isolation's atomic-visibility property in which two readers…
Long-lived key risk — Long-lived key risk is the principle that the impact of a
Lossless weight compression — Lossless weight compression is the problem class of reducing
Low-bit inference — Low-bit inference is the umbrella practice of serving
LSM compaction (size-tiered, leveled, hybrid) — Log-Structured Merge (LSM) is the standard storage organisation
Machine metadata service — A machine metadata service is a guest-local endpoint that a
Machine-readable documentation — Machine-readable documentation = repository docs structured for
Maintenance window — A maintenance window is a customer-configurable contract with a
Managed data plane — Managed data plane is the architectural primitive where the
Managed Kubernetes service — A managed Kubernetes service is a hosted offering where the cloud
Markdown content negotiation — Markdown content negotiation is the convention that when a
Matrix multiply-accumulate (MMA) — Matrix multiply-accumulate (MMA) is the fused primitive C ←
MCP Server Card — MCP Server Card is a draft well-known JSON document that
Mean Reciprocal Rank (MRR) — Mean Reciprocal Rank (MRR) is a statistical metric for
Memory-aware scheduling — Memory-aware scheduling is the scheduler policy of sizing and
Memory-bandwidth-bound inference — Memory-bandwidth-bound inference is the regime in which
Memory-bound vs compute-bound (GPU inference) — A workload is memory-bound when its performance is limited by
Memory compaction — Memory compaction (also called context compaction) is the lifecycle moment in an agent loop at which the [[concepts/agent-context-window|con…
Memory safety — Memory safety is the property that a program cannot access memory it isn't authorized to — no use-after-free, no buffer overrun, no double-f…
Memory supersession — Memory supersession is the discipline of updating an agent-memory fact or instruction by keeping the old memory, linking it forward to the n…
Merge-Base Three-Tree Sync — Merge-base three-tree sync is a file-sync data model that persists
Merge-on-read (MOR) — Merge-on-Read (MOR) is the row-level update strategy in open table
Metadata boost — Metadata boost is a query-time ranking adjustment that nudges results up or down based on document metadata values (recency, priority, regio…
Metric-granularity mismatch — Metric-granularity mismatch is the observability failure mode
Metric temporality (delta vs cumulative) — Temporality is how a metric data point expresses change over time:
Metric-type metadata — An explicit, authoritative record of each metric's type
Micro-VM as Pod — Micro-VM as Pod is the architectural stance of making the
Micro-VM Isolation — Micro-VM isolation is the practice of running each tenant's code inside a
Min-cost flow — Min-cost flow is a class of graph algorithms that find the
ML-first architecture — ML-first architecture is a chip-design posture that reverses
Model FLOPs Utilization (MFU) — Model FLOPs Utilization (MFU) is the ratio of FLOPs actually
Monitoring paradox — Monitoring paradox: the observability layer deployed to catch
Monorepo — Monorepo = a single shared version-control repository holding
Multi-document ACID transactions — A multi-document ACID transaction bundles multiple document writes — potentially across multiple collections, and in MongoDB's 2019+ version…
Multi-GPU serving (LLM) — Multi-GPU serving is the LLM-inference regime in which a single model instance spans multiple GPUs because the model's weights + working-set…
Mutual TLS (mTLS) — Mutual TLS (mTLS) is TLS with two-way peer authentication:
NDCG (Normalized Discounted Cumulative Gain) — Normalized Discounted Cumulative Gain (NDCG) is a standard
Network round-trip cost — The round-trip-time (RTT) floor between an application process and a
Neurosymbolic AI — Neurosymbolic AI is the composition of neural methods (LLMs, transformers, RL) with symbolic methods (mechanical theorem provers, SAT/SMT so…
NMSE (Normalized Mean Squared Error) — Normalized Mean Squared Error (NMSE) is the graded-scale
Nodeless Kubernetes — Nodeless Kubernetes is the architectural stance of running a
noindex meta tag — The noindex meta tag is an HTML directive, typically in
Noise injection in evaluation — Noise injection in evaluation is the counter-intuitive discipline
Noise Protocol Framework — The Noise Protocol Framework is Trevor Perrin's
Noisy neighbor — Noisy neighbor names the multi-tenant failure mode where one tenant's workload perturbs another tenant's latency/throughput — through shared…
NoSQL database — A NoSQL database is a non-relational datastore — it does not require data to be laid out in fixed-schema tables joined via foreign keys and …
OAuth Protected Resource Metadata — OAuth Protected Resource Metadata (RFC 9728)
Object-tree document model — An object-tree document model represents a document as a rooted
Observability — The function of providing visibility into application performance and
OIDC federation for cloud access — OIDC federation for cloud access is the architectural pattern
OLTP vs OLAP — OLTP (Online Transaction Processing) and OLAP (Online Analytical
On-Device ML Inference — On-device ML inference is the running of ML inference on
One-to-one agent instance — The one-to-one agent instance is the structural observation
Online scheduling — Online scheduling is the problem class where "jobs arrive
Open Table Format — An open table format (OTF) is a metadata layer over columnar data
Operational Transform (OT) — Operational Transform (OT) is the dominant real-time-collaboration
Ownership (Amazon's organizational primitive) — Ownership, at Amazon, is the organizational primitive that says a single person or team is end-to-end accountable for a service or outcome: …
Packet sniffing as event source — Packet sniffing as event source is the move of
Padding removal (variable-length inference) — Padding removal (also called variable-length processing) is
Parameter binding — A parameter binding is the edge in a
Parameter system — A parameter system defines values that can be set once and
Parser differential — A parser differential is a security-relevant disagreement between two
Partition colocation (cross-topic) — Partition colocation (cross-topic) is the property that
pass@k — pass@k asks: for a given scenario, over k independent attempts,
Passive vs active client signals — Cloudflare's 2026-04-21 post taxonomizes the signals an origin
PEG grammar (Parsing Expression Grammar) — A Parsing Expression Grammar (PEG) is a formal grammar class
Per-account quotas — Most AWS service limits ("service quotas") are enforced per
Per-tenant search instance — A per-tenant search instance is an isolated (storage, index) unit, one per tenant (agent, customer, session, language, region, …), created a…
Performance isolation — Performance isolation is the property that one tenant's workload does not observably affect another tenant's latency or throughput — distinc…
Performance per watt — Performance per watt is the ratio of useful work (throughput,
Performance prediction — Performance prediction is the problem class of estimating a
Permissions DSL — A permissions DSL is a dedicated domain-specific language for
PID controller (feedback control) — A PID controller (proportional–integral–derivative) is a 90-year-old control-theory primitive that drives a process variable toward a setpoi…
Pipeline parallelism — Pipeline parallelism is a multi-GPU model-sharding strategy in which different transformer layers live on different GPUs — GPU 0 holds layer…
PK Token — A PK Token is an OpenID Connect
Pod Disruption Budget — A Pod Disruption Budget (PDB) is a Kubernetes primitive that
Point of presence — A Point of Presence (PoP) is a physical deployment site
Policy-as-data — Policy-as-data stores authorization policies outside application
Post-quantum authentication — Post-quantum authentication is the migration of digital-
Post-quantum cryptography — Post-quantum cryptography (PQC) is the class of cryptographic
Postgres MVCC and Heap-Only Tuple (HOT) updates — Multiversion Concurrency Control (MVCC) in Postgres is implemented by writing a
Predictive auto-scaling — Predictive auto-scaling is capacity control that scales a
Prefill/decode (PD) disaggregation — Prefill/decode disaggregation (or PD disaggregation) is an LLM-serving architectural pattern that separates the two stages of LLM inference …
Prefix-aware routing (LLM inference) — Prefix-aware routing is the inference-request routing strategy
Primary-Replica Topology Alignment — Primary-replica topology alignment is the structural principle
Program correctness — Program correctness is the property that a program produces the
Project rules / steering files — Project rules (also "steering files", "agent rules") =
Promise allocation overhead — Promise allocation overhead is the per-call CPU + memory
Prompt-boundary sanitization — Prompt-boundary sanitization is the practice of stripping any occurrence of the structural delimiters of an LLM prompt from user-controlled …
Publish-time immutability — Publish-time immutability is the semantics in which a composite
Pull vs push streams — A streaming API is fundamentally either pull-based (the
Push-based invalidation — Push-based invalidation is a reactive-update strategy: maintain an
Q-Day — Q-Day is the day a [[concepts/cryptographically-relevant-quantum-computer|
Quantization — Quantization rescales tensor elements from a high-precision
Query shape — A query shape is the un-parameterized form of a query,
Query vs document embedding (two distinct serving problems) — In retrieval / search / recommendation systems that use
Queueing theory (as applied to storage/IO stacks) — Queueing theory is the math of how waiting lines form and drain when arrivals are asynchronous. Applied to systems: between the CPU and dura…
Rack-level power density — Rack-level power density is the amount of electrical power a
Radiation effects on computing — Radiation effects on computing is the failure-mode class introduced when
RAG as a judge — RAG as a judge is the pattern of letting an
Rate-limit trilemma — The rate-limit trilemma is Cloudflare's framing for a fundamental
Rate-limited cache — A rate-limited cache is a lookup cache in front of an
RDMA KV transfer — RDMA KV transfer is the serving-infrastructure primitive of moving LLM KV cache blocks between GPUs (intra-node) or no…
Reachability-based subscription — Reachability-based subscription is a server-side real-time
React Hydration — Hydration is the process by which a frontend React runtime takes
React re-render — React re-render is the cost of React re-invoking a component's
Reactive auto-scaling — Reactive auto-scaling is capacity control that observes
Read-invalidation rendezvous — Read-invalidation rendezvous is the concurrency problem of
Real User Measurement — Real User Measurement (RUM) is performance data collected
Reciprocal Rank Fusion (RRF) — Reciprocal Rank Fusion (RRF) is a score-fusion technique for combining result lists from multiple independent retrieval methods into a singl…
Refinement-round budget — Refinement-round budget is the bounded-iteration discipline of a
reflect.MethodByName linker pessimism — reflect.MethodByName(name) with a non-constant name is
Relative Score Fusion (RSF) — Relative Score Fusion (RSF) is a score-fusion technique for combining result lists from multiple independent retrieval methods into a single…
Release attestation — A release attestation is a signed cryptographic statement that a
Relevance labeling — Relevance labeling is the activity of assigning a graded
Remote Build Execution (RBE) — Remote Build Execution (RBE) distributes build/test actions across a
Remote development environment — A remote development environment is an architectural setup
Replay training — Replay training (also: "rehearsal", "experience replay" in a related RL sense) is the technique of including examples from a model's previou…
Repo-per-agent-session — Repo-per-agent-session is the pattern-concept of giving every
Resilient inference stream — A resilient inference stream is an LLM-inference response
Resource stranding — Resource stranding is the failure mode where a server's
Response-body sampling — Response-body sampling is the technique of inspecting a
Risk-tier assessment — Risk-tier assessment is the discipline of classifying each code change into a small number of risk buckets before deciding how much review f…
robots.txt — robots.txt is a text file at the root of a site
RPO / RTO (recovery point / time objectives) — The two canonical Disaster Recovery budget dimensions:
Same-origin dictionary scope — Same-origin dictionary scope is the constraint — enforced
SAML authentication bypass — A SAML authentication bypass is any vulnerability that lets an
Saturation point (inference latency vs token count) — The saturation point is the token-count threshold on a specific
Scale to Zero — Scale-to-zero is a service-design property in which an application
Scaling latency — Scaling latency is the time between "we need more capacity" and
Scatter-gather query — A scatter-gather query is a query executed in a sharded system that cannot be routed to a single shard (because its predicate doesn't includ…
Schema evolution — Schema evolution is the problem of changing the structure of
Schema registry — A schema registry is a centralized, versioned store of data
Seasonality (daily / weekly) — Seasonality is the property of a time series in which values
seccomp — seccomp (short for secure computing mode) is a Linux
Seconds-scale GPU cluster boot — The property of a compute platform where a multi-node cluster of
Self-authored extension — A self-authored extension is a tool that an AI agent
Self-censoring forecast — A self-censoring forecast is a predictor that scores its
Self-invalidating forecast — A self-invalidating forecast is the hazard class where a
Self-service infrastructure — Self-service infrastructure is the platform-engineering property
Sensitive data exposure — Sensitive data exposure is a security vulnerability in which
Server-Side Sandboxing — Server-side sandboxing (also called workload isolation) is the practice
Serverless Compute — Serverless compute is a model in which the provider runs customer code on
Service Control Policy (SCP) — A Service Control Policy (SCP) is an [[systems/aws-organizations|
Service coupling — Service coupling is the degree to which services depend on each
Service topology — Service topology is a configuration abstraction answering "at
Session-affinity prompt caching (x-session-affinity) — Session-affinity prompt caching is the LLM-serving pattern of routing subsequent turns of a session back to the same replica / region that s…
Shard key — A shard key is the column (or composite) whose value selects which physical shard a row lives on under [[concepts/horizontal-sharding|horizo…
Shared-context fan-out — Shared-context fan-out is the pattern of writing large common context (merge-request metadata, previous review findings, diff summaries) to …
Shared-dictionary compression — Shared-dictionary compression is an HTTP-level compression
Shared Responsibility Model — The Shared Responsibility Model is AWS's contract-level framing
Short-lived credential auth — Short-lived credential auth is the security property that a
Similarity-tier retrieval — Similarity-tier retrieval names the product + eval constraint
Simplicity vs. Velocity — A first-class engineering concept in S3's 2025 retrospective: there is
Single-endpoint abstraction — Single-endpoint abstraction is the architectural principle of
Singleton workload — A singleton workload is a service that runs as a single
Sitemap — A sitemap is an XML file listing every URL on a site, plus
Slowly-Changing Dimension (SCD) — Slowly-Changing Dimension (SCD) is the dimensional-modeling pattern
Small map as sorted vector — Represent an associative container whose key count is empirically
Snapshot Isolation — Snapshot Isolation (SI) is a transaction-isolation model in which each transaction reads from a consistent snapshot of the database as of it…
Soft leader election — Soft leader election designates one pod as the coordinator for a given key by routing affinity — if all requests for a key land on the same …
Source-map composition — A source map is a file that maps locations in compiled output (JS, WASM,
Space-based compute — Space-based compute is the architectural choice to deploy the compute
Sparse Vector — A sparse vector is a vector representation where most components are zero, stored and queried via a coordinate format ({index: value} pairs)…
Specification-driven development — Specification-driven development is a workflow where the specification is a first-class, authored, maintained artifact — produced early, vis…
Speculative decoding — Speculative decoding is an LLM-inference latency-optimization
Speech recognition (ASR) — Automatic Speech Recognition (ASR) is the ML primitive that
Speed-Accuracy Trade-off (Real-Time Decisions) — Structural property of real-time decision systems where the data
Spiky traffic — Spiky traffic is the pattern where request arrivals have high
Split-brain — Split-brain is the failure mode in which two (or more) nodes each believe they have authoritative ownership of the same resource — typically…
SSO authentication (OpenID Connect) — Single sign-on (SSO) is the pattern where a user authenticates
Stack-trace sampling profiling — Stack-trace sampling profiling is a production profiling
Stage and Commit — Stage and commit is a synchronisation pattern borrowed from version
Stateless Compute — Stateless compute is a contract in which the execution environment
Static sharding — Static sharding pins application keys to backend nodes using a fixed scheme — most commonly consistent hashing — computed by clients without…
Storage overhead and fragmentation — Storage overhead is the ratio of raw capacity consumed to
Stream adapter overhead — Stream adapter overhead is the allocation / copy / buffering
Streaming aggregation — Streaming aggregation is the pattern of aggregating metrics in
Streaming SSR — Streaming server-side rendering is an SSR variant where the server
Strong Consistency (Read-after-Write) — Strong read-after-write consistency is the guarantee that once a
Structured output reliability — Structured-output reliability is the quality axis separate from
Sub-topology (Kafka Streams) — A sub-topology in systems/kafka-streams is a connected
Synchronization tax — Synchronization tax is the ongoing engineering + operational
Synchronous vs asynchronous GPU readback — Readback is copying pixel or buffer data back from the GPU
Syscall allowlist — A syscall allowlist is the policy artefact that names the
Tag protection — Tag protection is a Git-server-side invariant that, once a named
Tagged pointer — Pack non-pointer data into the architecturally-unused high or low
Tail latency at scale — "Tail latency at scale" names the failure mode where, as a system fans a single logical operation out across N hosts, the probability that a…
Task and actor model — The task and actor model is a low-level distributed-compute
Telemetry-based resource discovery — Telemetry-based resource discovery is the technique of inferring
Telemetry TTL as a one-way door — Telemetry data has a TTL. Metrics, logs, and traces expire on
Temporal logic specification — Temporal logic is a family of formal languages for specifying behaviors of systems over time — "X eventually happens", "A always precedes B"…
Tenant isolation — Tenant isolation in a multi-tenant SaaS is the property that one
Tensor parallelism — Tensor parallelism is a multi-GPU model-sharding strategy in which individual weight matrices (tensors) within each transformer layer are sp…
Tentative schedule — Tentative schedule is an algorithmic primitive for online
Test Case Minimization — Test case minimization (also: shrinking) is the step in a
Test sensitivity — Test sensitivity (closely related to statistical power) is the
Text-to-text regression — Text-to-text regression is numeric prediction done by a
TGW Appliance Mode — Appliance Mode is a property of an [[systems/aws-transit-gateway|
Threat modeling — Threat modeling is the discipline, originating in security engineering, of enumerating threats against a system before deciding on counterme…
Three-database problem — The three-database problem is the named infrastructure failure
Three-valued logic — Three-valued logic extends classical boolean logic with a third
Thundering herd — A thundering herd is a failure mode where a resource is
Tier-based instance sizing — Tier-based instance sizing is the capacity abstraction where
Tight migration scope — Tight migration scope is the posture of constraining a large
Time to first token (TTFT) — Time to first token (TTFT) is the LLM-serving latency metric measuring the delay between a request arriving at the server and the first outp…
Token-aware load balancing (LLM serving) — Token-aware load balancing is the admission-control / routing primitive for LLM-serving load balancers in which the balancer's per-endpoint …
Token-count-based batching — Token-count-based batching is the GPU-inference-serving discipline
Token enrichment — Token enrichment is the practice of adding authorization-relevant
Token verification — Token verification is the serving-side primitive that
Tool-selection accuracy — Tool-selection accuracy is an LLM agent's probability of
Training / serving boundary — The training / serving boundary is the organizational and
Trajectory evaluation — Trajectory evaluation scores an agent on how it investigated,
Transitive-dependency reachability — Transitive-dependency reachability is the graph-theoretic
Transitive parameter resolution — Transitive parameter resolution is the process of resolving a
Transparent cluster code distribution — The runtime property where code (module definitions, function
Trie data structure — A trie (pronounced "try" or "tree") is a tree-based data
TRIM / DISCARD integration — TRIM / DISCARD integration is the filesystem-to-block-layer
Trimean aggregation — The trimean is a single-number summary of a distribution
Tunable consistency — Tunable consistency is the property that a database lets applications choose the consistency + durability level per operation, rather than f…
UDP reflection + amplification — UDP reflection + amplification is a volumetric-DDoS technique
Uncertainty quantification — Uncertainty quantification (UQ) is the discipline of
Unified interface schema — A unified interface schema is a single machine-readable
Unified model catalog — A unified model catalog is the product-surface property of an
Unified storage and index — Unified storage and index is the managed-service property that a single write both stores the document and indexes it atomically, with no cu…
Uniform buffer — A uniform buffer is a chunk of GPU-visible memory that holds
Universal resource provisioning — Universal resource provisioning is the abstraction of every
Unix-socket API proxy — A Unix-socket API proxy is a local IPC endpoint — a Unix domain
Unlinkability — Unlinkability is the cryptographic property that a token
Unsigned right shift (>>>) — Unsigned right shift (Java spelling: >>>) is the bit-shift
User Action as Token — User-action-as-token is a recommendation-system framing that treats
V8 young generation — The young generation (aka "young space", "nursery") is the
Vector Embedding — A vector embedding is a dense numerical representation of a piece
Vector Quantization — Vector quantization (in the context of vector search) is
Vector Similarity Search — Vector similarity search is the retrieval primitive behind
Verified Bots — Verified bots is the general problem of distinguishing
Vertical partitioning — Vertical partitioning splits a monolithic database by moving groups of related tables onto separate database instances. Each moved table sta…
Visibility order vs. commit order — Visibility order vs. commit order names the architectural decision in an MVCC database about whether the sequence in which committed transac…
VM Escape — A VM escape is when code running inside a guest virtual machine breaks
VM lifetime prediction — VM lifetime prediction is the problem class of predicting
Write-Ahead Logging (WAL) — Write-Ahead Logging (WAL) is the durability primitive under nearly every
Warm isolate routing — Warm isolate routing is the scheduling policy used by V8-
Wasm Git server — Wasm Git server names the implementation shape in which the
WebAssembly — WebAssembly (often WASM) is a W3C-standardized binary
AWS Well-Architected Framework — The AWS Well-Architected Framework is AWS's design-review
Well-known URI (RFC 8615) — A well-known URI is a URL with a path starting /.well-known/
What Not To Flag Prompt — "What NOT to flag" is the prompt-engineering discipline of spending more explicit instruction on what an LLM should skip than on what it sho…
Window virtualization — Window virtualization (a.k.a. virtual scrolling / virtual
Winning-indicator t-test — In interleaving testing of two rankings
WiredTiger cache — The WiredTiger cache is the in-memory buffer pool that MongoDB's
WireGuard handshake — The WireGuard handshake is the exchange of UDP packets that
Word Error Rate (WER) — Word Error Rate (WER) is the canonical metric for
Working-set memory — Working-set memory — the subset of a database's data + indexes
Workload-aware routing — Workload-aware routing is the architectural pattern of making
Workload identity — Workload identity is the property that a process (a VM, a
Write amplification — Write amplification (WA) is the ratio of physical bytes
Write-dependency graph — A write-dependency graph is a server-side (or client-side) data
xDS protocol — xDS is Envoy's family of dynamic-configuration APIs: a streaming gRPC protocol over which a control plane pushes cluster, endpoint, listener…
XML signature wrapping — XML Signature Wrapping (XSW) is a family of attacks against XML-DSig
Zero-copy sharing — Zero-copy sharing means two or more processes (or languages
Zero-knowledge proof — A zero-knowledge proof (ZKP) is a cryptographic protocol by
Zero-trust authorization — Zero-trust authorization is the design principle that every tier