
PATTERN Cited by 1 source

Regional pre-warmed DO+Container pair pool

Problem

DO-enabled Containers place a Durable Object (DO) near the request, but the Container the DO connects to may spin up "on the other side of the world". For a chatty multi-message WebSocket interaction (e.g. headless-browser control: open page → navigate → wait → screenshot), every message between DO and Container pays the cross-region RTT (see concepts/do-to-container-cross-region-rtt).

You want two placement constraints simultaneously: the DO near the user, and the Container near the DO. Without both, latency is unpredictable and dominated by whichever two regions happened to be picked.

Pattern

Pre-create DO+Container pairs in regional pools. Each pool holds N pairs whose DO and Container are colocated within a bounded geography (a region). On request:

  1. Identify the request's region (typically the POP serving the request).
  2. Pick a DO+Container pair from that region's pool that is "closest" to the user (within-region latency considerations).
  3. Bind the pair to the request; serve the request entirely within-region.

The DO+Container distance is bounded by the diameter of the region, never by the platform's global footprint.

                  request
                    |
                    v
           [region selector]
                    |
        +-----------+-----------+
        v                       v
   [region: weur pool]    [region: wnam pool]
   ┌──────────────────┐   ┌──────────────────┐
   │ DO ── Container  │   │ DO ── Container  │
   │ DO ── Container  │   │ DO ── Container  │
   │ DO ── Container  │   │ DO ── Container  │
   │   (pre-warmed)   │   │   (pre-warmed)   │
   └──────────────────┘   └──────────────────┘

Each pool is pre-warmed so that request-time work is "select + bind", not "create".
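
The request-time "select + bind" path can be sketched as follows. This is a minimal in-memory illustration, not Browser Run's implementation: `WarmPair`, `RegionalPairPool`, the `site` field, and the caller-supplied latency estimator are all hypothetical names, and a real pool would need durable, strongly consistent state.

```typescript
// Hypothetical types for illustration; none of these names are Cloudflare APIs.
interface WarmPair {
  doId: string;        // pre-created Durable Object
  containerId: string; // Container colocated with that DO
  region: string;      // e.g. "weur", "wnam"
  site: string;        // assumed finer-grained location inside the region
}

class RegionalPairPool {
  private idle = new Map<string, WarmPair[]>();

  // Called ahead of time by whatever process pre-warms pairs.
  add(pair: WarmPair): void {
    const pairs = this.idle.get(pair.region) ?? [];
    pairs.push(pair);
    this.idle.set(pair.region, pairs);
  }

  // Request-time path: select + bind, never create.
  acquire(
    region: string,
    estimateLatencyMs: (pair: WarmPair) => number,
  ): WarmPair | undefined {
    const pairs = this.idle.get(region);
    if (pairs === undefined) return undefined;

    // Step 2: pick the idle pair with the lowest estimated latency to the user.
    let best: WarmPair | undefined;
    for (const p of pairs) {
      if (best === undefined || estimateLatencyMs(p) < estimateLatencyMs(best)) {
        best = p;
      }
    }
    if (best === undefined) return undefined; // pool exhausted

    // Step 3: bind by removing the pair from the idle set.
    pairs.splice(pairs.indexOf(best), 1);
    return best;
  }

  idleCount(region: string): number {
    return this.idle.get(region)?.length ?? 0;
  }
}
```

Returning `undefined` on an empty pool leaves the exhaustion policy (fallback region or cold create) to the caller.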

Verbatim canonical articulation

From the 2026-05-13 Browser Run migration post (Source: sources/2026-05-13-cloudflare-browser-run-now-running-on-cloudflare-containers-its-faster):

"Create regional pools of pre-warmed DO-backed browser containers to constrain the max distance (and hence max latency) between DOs and containers. When a request comes in, we pick a DO-container pair closest to the user within that region. This keeps latency low on both hops: user to DO, and DO to container. It adds a few more moving parts to our overall architecture, but we figured that was worthwhile so long as we had observability into the global state of each browser so that we could allocate and re-allocate capacity according to changing demand."

Two costs Cloudflare names explicitly: "more moving parts" and the need for "observability into the global state of each browser" to make capacity-rebalancing tractable.
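
To make the re-allocation step in the quote concrete, here is a minimal sketch of turning an observed global view into per-region actions. The source does not describe Cloudflare's allocator; `RegionState`, `targetIdle`, and `planRebalance` are hypothetical, and the target is assumed to come from some demand forecast.

```typescript
// Hypothetical snapshot of one region's pool, as seen by a global observer.
interface RegionState {
  region: string;
  idlePairs: number;  // pre-warmed pairs currently unbound
  targetIdle: number; // desired idle headroom (assumed to come from a forecast)
}

interface RebalanceAction {
  region: string;
  create: number;   // pairs to pre-warm
  teardown: number; // surplus pairs to retire
}

// Compare observed idle capacity with the target and emit one action per
// region that is off target. Regions already on target produce no action.
function planRebalance(states: RegionState[]): RebalanceAction[] {
  const actions: RebalanceAction[] = [];
  for (const s of states) {
    const delta = s.targetIdle - s.idlePairs;
    if (delta > 0) {
      actions.push({ region: s.region, create: delta, teardown: 0 });
    } else if (delta < 0) {
      actions.push({ region: s.region, create: 0, teardown: -delta });
    }
  }
  return actions;
}
```

The plan is only as good as the snapshot it is computed from, which is why the quote ties the pattern to observability.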

Preconditions

  1. Workload is chatty — one-shot RPC workloads don't justify the complexity. Pay it only when N × cross-region-RTT is the dominant latency contributor.
  2. DO and Container must be co-deployable — the substrate must allow pinning a DO and a Container to the same region. DO-enabled Containers (Cloudflare's open-beta primitive) are the canonical wiki substrate.
  3. Request-region is identifiable cheaply at request time. On Cloudflare, the POP geography is implicit; on other substrates this needs explicit signalling.
  4. Capacity can be observed globally — the pool needs to know how many pre-warmed pairs each region has, and where demand is shifting, to rebalance.
  5. Cross-region capacity rebalancing is acceptable — pairs may need to be created in one region and torn down in another; the substrate's lifecycle primitives must support this.
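
The first precondition is easy to check with arithmetic. The sketch below assumes the messages are sequential (each waits for the previous reply), and the RTT and message-count figures are illustrative assumptions, not measurements from the source.

```typescript
// Time spent on the DO<->Container hop for one request that performs
// `messages` sequential round trips, each costing `rttMs`.
function hopLatencyMs(messages: number, rttMs: number): number {
  return messages * rttMs;
}

// Illustrative numbers only (assumed, not from the source).
const chattyCrossRegion = hopLatencyMs(30, 150); // 4500 ms
const chattyInRegion = hopLatencyMs(30, 10);     // 300 ms
const oneShotCrossRegion = hopLatencyMs(1, 150); // 150 ms
```

Under these assumptions the chatty workload saves about 4.2 s per request by staying in-region, while the one-shot call could save at most 150 ms, which is unlikely to justify running regional pools.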

When the pattern fits

  • Headless-browser control at edge scale (Browser Run is the canonical instance).
  • Stateful agent runtimes with chatty internal RPC (agent ↔ tool, agent ↔ memory).
  • Live-collaboration substrates where a coordination DO fans out work to a per-session Container (transcription, rendering pipelines).
  • Anywhere the concepts/do-to-container-cross-region-rtt asymmetry creates user-visible latency that single-RPC shapes (e.g. one-shot page render) don't expose.

When the pattern doesn't fit

  • One-shot workloads — control-plane ops, infrequent background tasks. The pool's idle-cost outweighs the bounding benefit.
  • Highly variable per-tenant lifecycle — if each tenant needs a tenant-specific Container image, pre-warming a uniform pool doesn't help (see Fly.io's Sprite design for why uniform images matter for warm pools).
  • Embarrassingly small workload volumes — if you're running 10 requests/min total, the overhead of regional pools dwarfs the latency benefit.

Failure modes

  • Pool exhaustion under demand spike — a region's pool drains to zero; the next request faces cold-Container creation. Mitigated by demand forecasting + a backup-region fallback when the primary pool is empty (paying cross-region latency on the fallback path is the explicit trade).
  • Capacity-allocation stale state — the global view of "how many pairs in each region" must stay fresh enough for the placement decision; this is exactly the concepts/eventual-consistency-too-slow-for-allocation failure mode that drove Browser Run's KV → D1 migration.
  • Observability cost — keeping global per-pair state observable is its own engineering line item; without it the pattern degrades into "regional pools we don't know how to refill".
  • Region boundary defines a sharp cliff — a request at the edge of two regions may pay non-trivial latency depending on which it's routed to; the pattern doesn't gracefully degrade across region boundaries.
  • Pair lifecycle coupling — DO and Container are now bound for the pair's lifetime; killing one orphans the other and requires deliberate teardown semantics.
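
The pool-exhaustion mitigation above can be sketched as an ordered lookup: primary region first, then backup regions, then cold creation. All names here (`Pair`, `Acquired`, `acquireWithFallback`, the `eeur` region label in the usage note) are hypothetical, and the choice of backup regions is assumed to be configured elsewhere.

```typescript
interface Pair {
  doId: string;
  containerId: string;
  region: string;
}

// Tagged result so the caller can see which trade it is paying for.
type Acquired =
  | { kind: "primary"; pair: Pair }  // served within-region
  | { kind: "fallback"; pair: Pair } // served cross-region, latency bound is lost
  | { kind: "cold" };                // no warm pair anywhere; caller must create one

function acquireWithFallback(
  idle: Map<string, Pair[]>,
  primary: string,
  backups: string[],
): Acquired {
  const fromPrimary = idle.get(primary)?.pop();
  if (fromPrimary !== undefined) return { kind: "primary", pair: fromPrimary };

  // Primary pool is empty: try backup regions in the configured order.
  for (const region of backups) {
    const pair = idle.get(region)?.pop();
    if (pair !== undefined) return { kind: "fallback", pair };
  }
  return { kind: "cold" };
}
```

For example, with an empty `weur` pool and one idle pair in a backup region `eeur`, the first call returns `kind: "fallback"` and the next returns `kind: "cold"`. Emitting the `kind` as a metric is one way to notice that a region's pool is undersized.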

Composes with

  • patterns/single-http-request-over-chatty-websocket — reduces the multiplier on the bounded distance. Cloudflare applies both: regional pools bound the worst case; single-HTTP-request reduces N for the per-message hop count on quick-action paths.
  • concepts/warm-pool-instances — generalisation ("pre-create instances; serve create by dequeuing"). The DO+Container-pair pool is the two-primitive variant: what's pre-warmed is not a single VM but a paired DO+Container, with the placement constraint that they be colocated.
  • patterns/warm-pool-zero-create-path — earlier canonical instance at single-VM altitude (Fly.io Sprites); this pattern is the DO+Container-pair extension.

Seen in

  • sources/2026-05-13-cloudflare-browser-run-now-running-on-cloudflare-containers-its-faster: canonical wiki instance. Browser Run's screenshot path is WebSocket-based with dozens of messages per request; the uncoordinated DO+Container placement was driving user-visible latency. Regional pools of pre-warmed DO+Container pairs were the explicit architectural response. "60 browsers per minute via the Workers binding" and "120 concurrent — 4x the previous limit" are the headline outcomes (combined with the protocol-coalescing change).