
PATTERN Cited by 1 source

Durable-Object-backed Git server

Durable-Object-backed Git server is the substrate pattern for serving one Git repo per storage unit: one [[systems/cloudflare-durable-objects|Durable Object]] per repo hosts the Wasm Git server and its embedded storage; a front-end Worker handles auth, metrics, and DO lookup; R2 holds pack-file snapshots; KV tracks auth tokens. Introduced to the wiki by Cloudflare's 2026-04-16 [[systems/cloudflare-artifacts|Artifacts]] launch.

Shape

   git client / REST caller
 ┌──────────────────────────────────────┐
 │  Worker (stateless, per-request)     │
 │  • authn/authz (token via KV lookup) │
 │  • metrics (errors, latency)         │
 │  • repo_id → DO routing              │
 └──────────────────────────────────────┘
 ┌──────────────────────────────────────┐
 │  Durable Object (one per repo)       │
 │  • single-writer ordering            │
 │  • ~128 MB memory budget             │
 │  • Zig WASM Git server (~100 KB)     │
 │  • embedded SQLite (state.storage.kv)│
 │     · git objects (chunked, ≤2MB/row)│
 │     · deltas + base hashes alongside │
 │       resolved objects               │
 └──────────────────────────────────────┘
      │                     │
      ▼                     ▼
 ┌──────────────┐   ┌──────────────┐
 │  R2          │   │  KV          │
 │  pack-file   │   │  auth token  │
 │  snapshots   │   │  tracking    │
 └──────────────┘   └──────────────┘

What each layer does

Front-end Worker

  • Stateless — just a request-scoped compute unit, no persistent state of its own.
  • Authenticates the caller (token in HTTPS URL / header; look up in KV).
  • Metrics — error rate, latency, per-namespace counters.
  • Routes to the correct DO instance via repo_id (derived from URL + namespace).
  • Could host the Wasm Git engine inside the request, but in many implementations the engine lives in the DO, not the Worker. In Artifacts the Worker is a thin auth + routing layer.
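
The Worker's routing and auth steps above can be sketched in a few pure functions. This is a minimal illustration under stated assumptions, not the Artifacts implementation: the URL layout (/<namespace>/<repo>.git/...), the Bearer-only header handling, the TokenStore interface, and every function name are hypothetical. Only the general shape (token lookup in KV, repo id derived from URL + namespace) comes from the text.

```typescript
// Hypothetical helpers for a thin front-end Worker. The URL layout and all
// names here are assumptions for illustration only.

interface RepoRoute {
  namespace: string;
  repo: string;
  rest: string; // remainder of the path, e.g. "info/refs"
}

// Assumed URL layout: /<namespace>/<repo>.git/<remaining path>
function parseRepoRoute(pathname: string): RepoRoute | null {
  const parts = pathname.split("/").filter((p) => p.length > 0);
  const namespace = parts[0];
  const repoPart = parts[1];
  if (namespace === undefined || repoPart === undefined) return null;
  const repo = repoPart.slice(-4) === ".git" ? repoPart.slice(0, -4) : repoPart;
  if (repo.length === 0) return null;
  return { namespace, repo, rest: parts.slice(2).join("/") };
}

// Stable key used to pick the per-repo Durable Object.
function repoKey(route: RepoRoute): string {
  return `${route.namespace}/${route.repo}`;
}

// Extracts a token from an "Authorization: Bearer <token>" header value.
// Other schemes (e.g. Basic, as used by URL-embedded credentials) are not
// handled in this sketch.
function bearerToken(authorization: string | null): string | null {
  if (authorization === null) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(authorization.trim());
  return match !== null && match[1] !== undefined ? match[1] : null;
}

// Stand-in for an async KV lookup. The stored value is assumed (not stated
// in the source) to be the namespace the token is scoped to.
interface TokenStore {
  get(token: string): Promise<string | null>;
}

async function authorize(
  store: TokenStore,
  authorization: string | null,
  route: RepoRoute
): Promise<boolean> {
  const token = bearerToken(authorization);
  if (token === null) return false;
  const scope = await store.get(token);
  return scope === route.namespace;
}

// In a real Worker, the key would then select the Durable Object, roughly:
//   const id = env.REPOS.idFromName(repoKey(route)); // REPOS: hypothetical binding name
//   return env.REPOS.get(id).fetch(request);
```

Keeping these steps as pure functions means the Worker itself holds no state, which matches the "stateless, per-request" role described above.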

Durable Object per repo

  • Single-writer ordering. Concurrent pushes to the same repo serialise at the DO; no external lock service needed. Matches Git's own expectation that the server orders refs updates consistently.
  • Embedded SQLite storage. state.storage.kv (the DO's synchronous KV API, SQLite-backed) holds Git objects. Large objects are chunked across rows because of the 2 MB row limit.
  • Delta + resolved-object side-by-side. From Cloudflare:

    "We avoid calculating our own git deltas, instead, the raw deltas and base hashes are persisted alongside the resolved object. On fetch, if the requesting client already has the base object, Zig emits the delta instead of the full object, which saves bandwidth and memory."

    This trades storage for bandwidth and runtime memory inside the ~128 MB DO envelope.

  • Streaming by default. ~128 MB memory forces streaming on both fetch and push paths: ReadableStream<Uint8Array> built directly from WASM output chunks.
  • Wasm protocol engine. See concepts/wasm-git-server — a ~100 KB pure-Zig binary, 11 host-imported storage functions + 1 streaming output function, testable in isolation.
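
Two of the storage behaviours above, chunking large objects under the 2 MB row limit and choosing between the stored delta and the resolved object on fetch, can be sketched as follows. This is an illustrative model only: the key scheme, the StoredObject shape, and all function names are assumptions, and the real logic lives in the Zig engine rather than in TypeScript.

```typescript
// Hypothetical sketch; key scheme and type names are assumptions.
const MAX_ROW_BYTES = 2 * 1024 * 1024; // the 2 MB row limit mentioned above

interface Row {
  key: string;
  value: Uint8Array;
}

// Splits one object into rows of at most maxRowBytes each.
// Assumed key scheme: "<hash>:<zero-padded chunk index>" so keys sort in order.
function chunkObject(hash: string, bytes: Uint8Array, maxRowBytes: number = MAX_ROW_BYTES): Row[] {
  if (maxRowBytes <= 0) throw new Error("maxRowBytes must be positive");
  const rows: Row[] = [];
  // The `i === 0` clause emits a single empty row for an empty object.
  for (let offset = 0, i = 0; offset < bytes.length || i === 0; offset += maxRowBytes, i++) {
    const index = ("000000" + i).slice(-6);
    rows.push({ key: `${hash}:${index}`, value: bytes.subarray(offset, offset + maxRowBytes) });
  }
  return rows;
}

// Concatenates rows back into the original object, ordering by key.
function reassemble(rows: Row[]): Uint8Array {
  const sorted = rows.slice().sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  let total = 0;
  sorted.forEach((r) => { total += r.value.length; });
  const out = new Uint8Array(total);
  let offset = 0;
  sorted.forEach((r) => { out.set(r.value, offset); offset += r.value.length; });
  return out;
}

// A resolved object stored alongside the raw delta it arrived as, if any.
interface StoredObject {
  hash: string;
  resolved: Uint8Array;
  delta?: { baseHash: string; bytes: Uint8Array };
}

type WireForm =
  | { kind: "delta"; baseHash: string; bytes: Uint8Array }
  | { kind: "full"; bytes: Uint8Array };

// On fetch: send the stored delta only when the client already has its base.
function chooseWireForm(obj: StoredObject, clientHas: Set<string>): WireForm {
  if (obj.delta !== undefined && clientHas.has(obj.delta.baseHash)) {
    return { kind: "delta", baseHash: obj.delta.baseHash, bytes: obj.delta.bytes };
  }
  return { kind: "full", bytes: obj.resolved };
}
```

Because the delta is stored as received, the fetch path never computes a delta of its own; it only picks between two byte ranges that already exist.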

R2

  • Pack-file snapshots — durable, replicated, egress-free.
  • Longer-tail blob storage outside the hot DO memory budget.

KV

  • Auth tokens — edge-replicated, low-latency read at the front-end Worker.
  • Short TTLs / revocation propagation plausible but not detailed in the post.

Properties this pattern gives you for free

  1. One-to-one semantics. Each repo is its own single-writer, isolated-state actor; no cross-repo blast radius from a misbehaving caller.
  2. Scale-to-zero economics. Hibernated DOs cost nothing; millions-of-repos is cost-bounded by storage + operations, not by active-compute instances.
  3. Per-repo metrics + per-repo lifecycle events drop out of the topology: aggregation / fan-out / event subscription can attach to the DO identity.
  4. Fork / copy-on-write becomes cheap. fork() can share immutable object storage and only diverge on new commits. (Artifacts supports .fork(name, {readOnly}) out of the box.)
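
The copy-on-write idea in item 4 can be illustrated with a small in-memory model. The source does not describe how Artifacts implements fork, so treat this purely as a sketch of the concept: ObjectStore, Repo, and their methods are hypothetical names, and only the fork(name, {readOnly}) call shape is taken from the text.

```typescript
// Hypothetical in-memory model of fork-as-copy-on-write. Objects are
// immutable and content-addressed, so forks can share them; only refs diverge.
class ObjectStore {
  private objects = new Map<string, string>();

  put(hash: string, content: string): void {
    // Immutable: an existing hash is never overwritten.
    if (!this.objects.has(hash)) this.objects.set(hash, content);
  }

  get(hash: string): string | undefined {
    return this.objects.get(hash);
  }

  count(): number {
    return this.objects.size;
  }
}

class Repo {
  private refs = new Map<string, string>();
  readonly name: string;
  private store: ObjectStore;
  private readOnly: boolean;

  constructor(name: string, store: ObjectStore, readOnly: boolean = false) {
    this.name = name;
    this.store = store;
    this.readOnly = readOnly;
  }

  // Records a new object and points the ref at it.
  commit(ref: string, hash: string, content: string): void {
    if (this.readOnly) throw new Error(`${this.name} is read-only`);
    this.store.put(hash, content);
    this.refs.set(ref, hash);
  }

  resolve(ref: string): string | undefined {
    return this.refs.get(ref);
  }

  // Fork copies only the (small) ref table; the object store is shared.
  fork(name: string, opts: { readOnly?: boolean } = {}): Repo {
    const child = new Repo(name, this.store, opts.readOnly === true);
    this.refs.forEach((hash, ref) => child.refs.set(ref, hash));
    return child;
  }
}
```

The cost of a fork in this model is proportional to the number of refs, not the size of the history, which is the property item 4 relies on.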

When it applies

  • Storage unit has a natural caller-identified identity (repo id, tenant id, user id, session id). Artifacts' repo id is the identity.
  • Per-unit single-writer ordering is desired or required.
  • Per-unit state fits in the DO's memory + storage budget (with R2 for overflow) — tens of GB is fine, hundreds of TB is not.
  • Cost model should be massively-single-tenant — many units, most idle, few active at a time.

When it doesn't

  • Cross-unit transactions — DO is single-writer per DO, not across DOs. Multi-repo atomic operations need a different primitive.
  • Very large single units — one 10 TB Git repo doesn't fit the DO envelope; needs sharding inside or a different substrate.
  • Non-Cloudflare substrates — the specific economics (zero-idle DOs, R2 egress-free, edge KV) are what make the pattern compelling. The structural shape transfers to other actor-per-unit substrates (Orleans + SQL Azure, etc.) but the dollar-per-unit numbers won't.

Canonical pairing with patterns/git-protocol-as-api

The two patterns compose:

  • Git-protocol-as-API decides what the interface is (a Git remote URL that any client can speak to).
  • DO-backed Git server decides how to implement it (one DO per repo, Wasm engine, layered storage).

You can pick up the interface pattern without the substrate pattern (e.g. implement Git-protocol-as-API on Postgres + a managed Git library), and vice versa (e.g. per-DO SQLite sync / per-DO state machine), but the 2026-04 Artifacts launch is where both ship together.

Sibling instances in the 2026-04 Cloudflare arc

The same "one DO per caller-identified unit" substrate is the load-bearing primitive of:

  • Agent Lee — DO as credentialed proxy + elicitation-gate host.
  • Project Think — DO as agent-actor with fibers, sub-agents (Facets), sessions.
  • AI Search — DO not directly, but ai_search_namespaces instances are the retrieval-tier analogue (per-agent/per-customer runtime-provisioned instances with per-instance state).
  • Artifacts — DO as per-repo Git server + storage.

Four distinct products, same substrate shape (AI Search by analogy rather than on DOs directly). The pattern is generalisable.

Seen in
