SYSTEM Cited by 5 sources
Temporal¶
Temporal is a durable workflow-orchestration engine — write workflow logic as regular code; Temporal handles persistence, retry, rollback, long-running timers, and replay across process restarts. Originally a fork of Uber's Cadence; now a standalone OSS + commercial offering.
Stub page — expand on future Temporal-internals sources.
Cluster architecture¶
A Temporal cluster is a Temporal Server paired with a persistence layer. The Temporal Server itself decomposes into four independently-scalable services (Source: ):
- Frontend gateway — rate limiting, routing, and authorisation for incoming client traffic.
- History subsystem — maintains per-workflow mutable state and timers; internally partitioned across shards (see Shilkov 2021 for sizing).
- Matching subsystem — hosts task queues and dispatches work to registered workers.
- Worker service — handles Temporal's own internal background workflows (not user-written workflows, which run in the customer's worker processes).
Each subsystem scales against its own load profile. The persistence layer (see below) is pluggable and sits beneath all four.
Rehydration via event-history replay¶
Temporal's durability guarantee is mechanically simple (Source: ):
Temporal captures the progress of a workflow execution (or workflow steps) in a log called the history. In case of a crash, Temporal rehydrates the workflow; that is, Temporal restarts the workflow execution, deduplicates the invocation of all activities that have already been executed, and catches up to where it previously left off.
The event history is the append-only log; replay plus activity deduplication is the recovery path. This is Temporal's implementation of durable execution at the workflow-engine level: the write-ahead-log (WAL) pattern applied to user-space workflow code.
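The replay-plus-dedup recovery path can be illustrated with a toy model. This is a minimal sketch, not the Temporal SDK: `History`, `WorkflowContext`, `run_activity`, and the two example activities are hypothetical names invented for illustration, and the history is an in-memory list rather than a persisted log.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class History:
    """Append-only log of (activity name, result) pairs. Hypothetical toy model."""
    events: List[Tuple[str, Any]] = field(default_factory=list)


class SimulatedCrash(Exception):
    """Stands in for a worker process dying mid-workflow."""


class WorkflowContext:
    def __init__(self, history: History) -> None:
        self.history = history
        self._cursor = 0  # next history position to consume during replay

    def run_activity(self, name: str, fn: Callable[[], Any]) -> Any:
        # Replay: the activity already ran, so return the recorded result
        # instead of invoking it again (deduplication).
        if self._cursor < len(self.history.events):
            recorded_name, result = self.history.events[self._cursor]
            if recorded_name != name:
                raise RuntimeError(
                    f"non-deterministic workflow: expected {recorded_name!r}, got {name!r}"
                )
            self._cursor += 1
            return result
        # Live execution: run the activity and record its result.
        result = fn()
        self.history.events.append((name, result))
        self._cursor += 1
        return result


calls: List[str] = []  # tracks real side effects


def reserve() -> str:
    calls.append("reserve")
    return "r-1"


def charge() -> str:
    calls.append("charge")
    return "c-1"


def workflow(ctx: WorkflowContext, crash_after_reserve: bool = False):
    a = ctx.run_activity("reserve", reserve)
    if crash_after_reserve:
        raise SimulatedCrash()
    b = ctx.run_activity("charge", charge)
    return a, b


history = History()
try:
    workflow(WorkflowContext(history), crash_after_reserve=True)
except SimulatedCrash:
    pass  # the worker died; the history survives

# Rehydrate: rerun the same workflow code against the surviving history.
result = workflow(WorkflowContext(history))
```

On the second run `reserve` is served from the history and only `charge` executes for real, so each activity's side effect happens once. The name check in `run_activity` hints at why workflow code in this model has to be deterministic: replay only works if the code requests the same activities in the same order.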
Persistence layer¶
Temporal's persistence layer is pluggable between SQL (MySQL / Postgres) and NoSQL (Cassandra), with opposing operational envelopes:
- SQL — familiar ops, single-node write ceiling caps Temporal throughput at that one box unless you shard manually.
- NoSQL — linear horizontal scalability for Temporal's workload shape; inherits Cassandra's tuning surface (compaction, tombstones, repair, gossip cluster health).
Six categories of data live in the persistence layer: tasks to dispatch, workflow execution state, mutable execution state, event history (the replay substrate), namespace metadata, and visibility data (for queries like "show all running Workflow Executions").
Because Temporal already uses horizontal partitioning internally (History subsystem), a horizontally-sharded backing store (Vitess / PlanetScale / Cassandra) composes cleanly: shard counts on either side scale independently. PlanetScale's pitch, per Longoria 2022-07-22, is to collapse the SQL-vs-NoSQL trade-off: "If you choose PlanetScale, you get both: operational simplicity and scalability." See concepts/temporal-persistence-layer for the full trade-off framing.
Role in the wiki¶
Datadog CDC pipeline provisioning (2025-11-04)¶
Datadog uses Temporal workflows as the automation layer over
CDC pipeline provisioning. The manual runbook for standing up
a Postgres-to-Kafka-to-Elasticsearch pipeline is 7 steps (enable
wal_level=logical, create Postgres users, create publications +
slots, deploy Debezium, create Kafka topics, set up heartbeat
tables, configure sink connectors); Datadog decomposes each step
into a Temporal activity and composes them into higher-level
provisioning workflows.
"Using Temporal workflows, we broke the provisioning process into modular, reliable tasks — then stitched them together into higher-level orchestrations. This made it easy for teams to create, manage, and experiment with new replication pipelines without getting bogged down in manual, error-prone steps." (Source: sources/2025-11-04-datadog-replication-redefined-multi-tenant-cdc-platform)
Durable-execution properties the workflow-engine layer inherits: per-activity retry, replay across worker crashes, first-class long-running timers (waiting for a slot LSN to advance, waiting for a sink to catch up), compensation branches for partial-failure cleanup, and a queryable event history per provisioning instance. See patterns/workflow-orchestrated-pipeline-provisioning for the general shape.
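Two of the properties listed above, per-activity retry and compensation for partial-failure cleanup, can be sketched in plain Python. This is an illustrative toy under stated assumptions, not Datadog's code or the Temporal SDK: `with_retry`, `provision`, the retry count, and the step functions are all hypothetical, and the steps only append to an in-memory log.

```python
import time
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None], Callable[[], None]]  # (name, action, compensate)


def with_retry(fn: Callable[[], None], attempts: int = 3, backoff_s: float = 0.0) -> None:
    """Run fn, retrying on any exception up to `attempts` times (attempts >= 1)."""
    last_error: Exception = RuntimeError("no attempts made")
    for attempt in range(1, attempts + 1):
        try:
            fn()
            return
        except Exception as exc:  # a real system would filter retryable errors
            last_error = exc
            time.sleep(backoff_s * attempt)
    raise last_error


def provision(steps: List[Step]) -> List[str]:
    """Run steps in order; on a permanent failure, undo completed steps in reverse."""
    done: List[Tuple[str, Callable[[], None]]] = []
    for name, action, compensate in steps:
        try:
            with_retry(action)
        except Exception:
            for _, undo in reversed(done):
                undo()
            raise
        done.append((name, compensate))
    return [name for name, _ in done]


log: List[str] = []
flaky = {"calls": 0}


def create_publication() -> None:
    flaky["calls"] += 1
    if flaky["calls"] < 2:
        raise ConnectionError("transient failure")  # succeeds on the retry
    log.append("create_publication")


def deploy_debezium() -> None:
    raise RuntimeError("permanent failure")  # exhausts all retries


steps: List[Step] = [
    ("create_users", lambda: log.append("create_users"), lambda: log.append("drop_users")),
    ("create_publication", create_publication, lambda: log.append("drop_publication")),
    ("deploy_debezium", deploy_debezium, lambda: log.append("undeploy_debezium")),
]

try:
    provision(steps)
except RuntimeError:
    log.append("failed")
```

The transient failure in `create_publication` is absorbed by the retry, while the permanent failure in `deploy_debezium` triggers cleanup of the two completed steps in reverse order. A workflow engine adds what this sketch lacks: the list of completed steps survives a crash of the process running `provision`.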
OSS-on-Kubernetes case study (Figma, 2024-08-08)¶
Temporal is cited as a concrete example of the OSS software that teams at Figma wanted to run: easy to install on Kubernetes via Helm, but it would have required hand-porting into systems/terraform on ECS. This was part of the motivating case for the ECS→EKS migration. (Source: sources/2024-08-08-figma-migrated-onto-k8s-in-less-than-12-months)
Instacart Maple — LLM batch pipeline orchestration (2025-08-27)¶
Maple, Instacart's internal batch-LLM-processing service, uses Temporal as the durable-execution substrate for its multi-batch LLM workflow: every activity (encode, upload, poll, download, decode, merge) is a Temporal activity; the overall job is a Temporal workflow. The specific motivation:
"Running batch jobs at this scale means errors are inevitable — network issues, provider failures, or bugs can happen mid-run. We use the Temporal durable execution engine to ensure that jobs can resume exactly where they left off without losing any work. This not only protects against data loss but also avoids wasting money on partially completed jobs." (Source: sources/2025-08-27-instacart-simplifying-large-scale-llm-processing-with-maple)
LLM batch APIs bill on submission, not completion — so durable execution's normal crash-recovery argument is amplified: losing state mid-workflow doesn't just mean re-doing work, it means re-paying for work already done. Canonical instance of patterns/llm-batch-processing-service built on Temporal.
Seen in¶
- sources/2026-04-28-airbnb-skipper-building-airbnbs-embedded-workflow-engine — first wiki-canonical statement of Temporal as the contrast case for an embedded-library workflow engine. Airbnb's Skipper is designed explicitly against the Temporal model for Tier 0 services: "External orchestration engines are the industry gold standard for durable workflow execution, providing exactly-once semantics and battle-tested reliability. However, they require dedicated infrastructure — a cluster of servers and a persistence layer, along with operational expertise — to maintain. For our highest-criticality, 'Tier 0' services … adding a new critical dependency was problematic. An orchestration cluster outage would mean every dependent service would lose the ability to start or advance workflows." Skipper's two explicit divergences from Temporal's model: (1) library-in-service rather than external cluster (see concepts/embedded-workflow-engine); (2) state-field replay rather than event-history replay (see concepts/workflow-replay-from-checkpointed-actions) — "Unlike event-sourced orchestration systems that reconstruct state by replaying an entire event history, Skipper persists state fields directly. There's no event log to replay, just current state and checkpointed action results. This makes execution leaner … though it trades some auditability for that efficiency." Shared primitives across the two models: deterministic workflow methods, checkpointed activities/actions, signals / @SignalMethod, durable waits, compensation. Skipper explicitly notes that teams needing "cross-language support or cross-service orchestration may find a dedicated orchestration system more appropriate" — Temporal's native fit.
- — canonical wiki disclosure of Temporal's four-subsystem cluster decomposition (Frontend gateway / History / Matching / Worker service), the six categories of persistence-layer data (tasks, execution state, mutable state, event history, namespace metadata, visibility data), and the SQL-vs-NoSQL operational-simplicity-vs-scalability trade-off framing. Savannah Longoria's Part 1 (2022-07-22) of a product-pairing tutorial positions PlanetScale / Vitess as the horizontally-sharded backing store that composes cleanly with Temporal's own internal History-subsystem sharding. Shilkov 2021 is cited as the canonical shard-sizing reference.
- sources/2025-11-04-datadog-replication-redefined-multi-tenant-cdc-platform — automation substrate under Datadog's managed multi-tenant CDC replication platform; wraps Debezium + Kafka Connect pipeline provisioning into durable workflows. Canonical wiki instance of patterns/workflow-orchestrated-pipeline-provisioning.
- sources/2025-08-27-instacart-simplifying-large-scale-llm-processing-with-maple — canonical wiki instance of Temporal as LLM batch pipeline substrate. Protects against data loss and avoids paying for partially completed LLM batches (batches bill on submission, not completion).
- sources/2024-08-08-figma-migrated-onto-k8s-in-less-than-12-months — cited as OSS-on-Kubernetes example motivating the Helm-ecosystem pitch; not a Temporal-internals ingest.
- sources/2025-02-12-flyio-the-exit-interview-jp-phillips — lineage citation via Cadence. JP Phillips, who built Fly.io flyd, names Cadence (Temporal's predecessor) as one of two direct influences on flyd's FSM design; also frames Cadence as "child of AWS Step Functions and the predecessor to Temporal (the company)" — useful single-source lineage statement.
- — canonical wiki disclosure of Temporal's two hardest operational invariants: (1) Temporal serialises all updates on a single shard — "Temporal serializes all updates belonging to the same shard, so all updates are sequential. As a result, the latency of a database operation limits the maximum theoretical throughput of a single shard." This is the correctness constraint that makes single-shard throughput latency-bound, not bandwidth-bound. (2) numHistoryShards is immutable after initial cluster deployment — "you must set this value high enough to scale with this Cluster's worst-case peak load" — the only sharded-system shard count on the wiki with a hard no-reshard property. Savannah Longoria's Part 2 (2022-12-14) is the operational sibling to Part 1's four-subsystem architectural disclosure. Also documents production Temporal-on-PlanetScale VSchema (two-keyspace split with 13 sharded + 14 unsharded tables, xxhash Primary Vindex on shard_id / range_hash) and empirical Black-Friday-to-Cyber-Monday QPS ledger (40k–200k sustained, peaks to 180k). First wiki disclosure that PlanetScale uses Temporal internally to automate Vitess release workflows — self-composition of product primitives.
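The latency-bound ceiling from invariant (1) can be put into numbers, which also shows why the fixed shard count in invariant (2) forces up-front sizing. A back-of-the-envelope sketch only: the function names, the 5 ms latency, the 10,000 updates/s peak, and the 1.5× headroom factor are hypothetical figures chosen for illustration, and the sizing rule assumes load spreads evenly across shards.

```python
import math


def shard_ceiling_per_s(db_latency_ms: float) -> float:
    """Theoretical max updates/s on one shard when updates are strictly sequential:
    one update per database round trip."""
    return 1000.0 / db_latency_ms


def min_history_shards(peak_updates_per_s: float, db_latency_ms: float,
                       headroom: float = 1.5) -> int:
    """Smallest shard count covering the peak with headroom.
    Assumes perfectly even load across shards (an idealisation)."""
    return math.ceil(peak_updates_per_s * headroom / shard_ceiling_per_s(db_latency_ms))


ceiling = shard_ceiling_per_s(5.0)        # 5 ms per database operation
shards = min_history_shards(10_000, 5.0)  # hypothetical worst-case peak
```

With these example figures one shard tops out at 200 updates per second, so a 10,000 updates/s peak with 1.5× headroom needs at least 75 shards. Halving database latency doubles the per-shard ceiling, which is why persistence latency, not bandwidth, is the quantity to watch.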
Related¶
- systems/kubernetes — deployment substrate for hosted Temporal.
- systems/cadence — predecessor project, same core team pre-Temporal.
- systems/flyd — Fly.io orchestrator whose FSM design is ancestry-linked to Cadence, the Temporal predecessor.
- systems/maple-instacart — canonical LLM-batch wiki instance.
- systems/planetscale / systems/vitess — horizontally-sharded MySQL substrate pitched as the persistence-layer option that collapses the SQL-vs-NoSQL trade-off.
- systems/mysql / systems/postgresql — the SQL arm of the pluggable persistence layer.
- systems/apache-cassandra — the NoSQL arm of the pluggable persistence layer.
- concepts/durable-execution — the broader concept Temporal realises.
- concepts/fault-tolerant-long-running-workflow — the scale-derived correctness framing that applies to any long-running event-history-driven workflow, Temporal included.
- concepts/temporal-persistence-layer — canonical wiki page for the SQL-vs-NoSQL trade-off + six data categories.
- concepts/horizontal-sharding — the partitioning primitive Temporal uses internally (History subsystem) and expects the backing store to expose for scale.
- concepts/serialized-per-shard-updates — Temporal's correctness discipline: all updates on a single shard are applied sequentially.
- concepts/single-shard-throughput-ceiling — direct consequence: per-shard throughput = 1 / persistence-latency.
- concepts/num-history-shards-immutability — the immutable-after-deploy sizing constraint that makes worst-case-peak-load capacity planning unavoidable.
- patterns/workflow-orchestrated-pipeline-provisioning — canonical pattern Temporal is used for in the wiki.
- patterns/llm-batch-processing-service — LLM-batch variant.
- patterns/split-sharded-plus-unsharded-keyspaces — production-validated VSchema shape for running Temporal on Vitess / PlanetScale.