
PATTERN Cited by 1 source

Session affinity for MCP SSE

Shape

When a multitenant MCP server fleet accepts long-lived Server-Sent Events (SSE) connections from LLM clients, the routing tier must guarantee that every SSE connection from a given client lands on the same stateful MCP-server instance — the one holding that client's session state.

    Client A (LLM) ── SSE stream 1 ─┐
    Client A (LLM) ── SSE stream 2 ─┴─▶  MCP-server instance X  (holds A's state)
    Client B (LLM) ── SSE stream 1 ─┐
    Client B (LLM) ── SSE stream 2 ─┴─▶  MCP-server instance Y  (holds B's state)

Without the affinity guarantee, a given client's subsequent connections may land on an instance that lacks the session state, forcing either session reconstitution (expensive, often impossible) or cross-instance state sharing (expensive at the store tier).

Why long-lived SSE changes the routing contract

Classic HTTP request/response MCP could fan out across any instance; each request was self-contained. Modern MCP flows (long-lived SSE) carry the session on the connection. Once the first SSE connection lands on instance X, instance X owns the session's in-memory state: tool registrations, subscription lists, in-flight tool-invocation state, LLM-client context.

Any routing decision that sends a subsequent connection from the same client to a different instance breaks the contract.
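
The contract can be made concrete with a tiny sketch (all names hypothetical): each instance holds session state only in its own process memory, so a connection routed to the wrong instance finds nothing to resume.

```python
# Hypothetical sketch: per-instance in-memory session state. Only the
# instance that accepted the first SSE connection can serve later ones.

class McpInstance:
    def __init__(self, name: str):
        self.name = name
        self.sessions: dict[str, dict] = {}  # session-id -> in-memory state

    def open_session(self, session_id: str) -> None:
        # First SSE connection: this instance now owns the session's state
        # (tool registrations, subscriptions, in-flight invocations, ...).
        self.sessions[session_id] = {"tools": [], "subscriptions": []}

    def handle(self, session_id: str) -> dict:
        # A subsequent connection routed here only works if we hold the state.
        if session_id not in self.sessions:
            raise LookupError(f"{self.name} has no state for {session_id}")
        return self.sessions[session_id]

x, y = McpInstance("X"), McpInstance("Y")
x.open_session("client-a")
x.handle("client-a")       # same instance: resumes fine
try:
    y.handle("client-a")   # broken affinity: the state simply isn't there
except LookupError as err:
    print(err)
```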

Implementation

Tenant-controlled routing

Fly.io's specific answer is tenant-controlled dynamic request routing on Fly Proxy. The tenant's MCP server app can specify, per-connection, which Fly Machine should receive the request — based on header, path, or other routable attribute. The tenant's MCP server is free to hash, look up, or otherwise route the client to a specific Machine; Fly Proxy honours the decision.
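
A minimal sketch of what the tenant-side decision can look like, assuming Fly's `fly-replay` response header (which asks Fly Proxy to replay a request on a named Machine); the session-to-Machine lookup here is illustrative, not a Fly API:

```python
# Hypothetical sketch: if this Machine doesn't own the session, answer
# with a fly-replay header naming the Machine that does, and Fly Proxy
# re-routes the connection there. The owner directory is illustrative.

SESSION_OWNERS = {"sess-42": "machine-x", "sess-99": "machine-y"}
SELF_MACHINE_ID = "machine-x"

def route(session_id: str) -> dict:
    owner = SESSION_OWNERS.get(session_id)
    if owner is None or owner == SELF_MACHINE_ID:
        # Unknown sessions are claimed by the first Machine to see them;
        # owned sessions are served locally (the SSE stream starts here).
        return {"status": 200}
    # Ask Fly Proxy to replay this request on the owning Machine.
    return {"status": 200, "headers": {"fly-replay": f"instance={owner}"}}

print(route("sess-42"))  # served locally
print(route("sess-99"))  # replayed to machine-y
```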

Canonical Fly.io framing:

More recent MCP flows involve repeated and potentially long-lived (SSE) connections. To make this work in a multitenant environment, you want these connections to hit the same (stateful) instance. So we think it's possible that the control we give over request routing is a robot attractant.

(Source: sources/2025-04-08-flyio-our-best-customers-are-now-robots)

Header-based affinity

A common shape is: the MCP server identifies the client session on first connection, assigns it to an instance, and returns an affinity cookie / header. Subsequent connections include the header; the routing tier hashes it to the same instance.
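
One stateless way to implement the hash step is rendezvous (highest-random-weight) hashing: every routing node independently maps the same header value to the same instance, with no shared table. A sketch with hypothetical instance names:

```python
# Sketch of header-based affinity via rendezvous hashing: score each
# instance against the session ID and pick the highest scorer. Any
# routing node computes the same answer for the same header value.
import hashlib

INSTANCES = ["mcp-1", "mcp-2", "mcp-3"]

def owner(session_id: str, instances=INSTANCES) -> str:
    def score(inst: str) -> int:
        digest = hashlib.sha256(f"{inst}:{session_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(instances, key=score)

# Every connection carrying the same affinity header lands on the
# same instance, including reconnects after an idle drop.
assert owner("sess-42") == owner("sess-42")
```

A nice property of rendezvous hashing here: when one instance disappears, only the sessions it owned re-map; everyone else keeps their affinity.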

Connection-level affinity

Some routing tiers pin at the TCP / TLS connection level — once a connection is established, all HTTP requests on that connection stay on the same backend. This is sufficient if the MCP client reuses connections; it's not sufficient if the MCP client opens fresh connections per SSE stream.

Persistent-instance-per-client

A structural alternative: give each client its own server-instance address, so affinity is structural rather than a routing-tier decision. This is the 1:1 agent-to-instance shape (concepts/one-to-one-agent-instance) from Cloudflare's Agents SDK, backed by Durable Objects. Each agent is a DO; the DO address is the affinity key.
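
The structural idea can be sketched without the Workers API: derive the instance address deterministically from the agent ID, in the spirit of a Durable Object ID derived from a name, so the routing tier needs no lookup at all.

```python
# Sketch of structural affinity (not the Workers API): the agent ID *is*
# the affinity key, because the instance address is a pure function of it.
import hashlib

def agent_address(agent_id: str) -> str:
    # Same agent ID -> same address, always; analogous in spirit to
    # deriving a Durable Object ID from a name.
    return hashlib.sha256(agent_id.encode()).hexdigest()[:16]

# No routing decision to get wrong: every connection for client-a
# resolves to the same address on every routing node.
assert agent_address("client-a") == agent_address("client-a")
```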

Three styles of "solve this problem"

Style                                    Routing primitive                               State                    Cost
Tenant-driven dynamic routing (Fly.io)   Header / path / cookie → specific Fly Machine   Per-instance in-memory   Tenant chooses
1:1 agent-to-DO (Cloudflare)             Agent ID = DO address                           Per-DO, structural       Per-agent storage
Shared-store + any-instance (classic)    Round-robin                                     External store           Hot path through shared store

Fly.io's shape keeps the state cheap (in-memory per Machine) at the cost of routing complexity. Cloudflare's shape structurally eliminates the routing decision at the cost of a DO per agent. The classic shared-store shape is simplest for the routing tier but most expensive on every hot path.

Open questions

  • Failover / rebalancing. When the instance holding a client's session dies, the routing tier has to detect the failure and re-route. How cleanly this is handled is deployment-specific; Fly's post doesn't engage with it.
  • Connection migration. Can an SSE stream migrate to a different instance mid-flight without the client noticing? Generally no (the server state would have to migrate too), so clients typically reconnect.
  • Long-idle drops. Intermediate proxies (LBs, firewalls) may kill idle SSE connections. Affinity only matters if the connection stays up; a client that reconnects is fine as long as the reconnect hashes back to the same instance.
  • Multi-connection sessions. Some MCP flows open multiple SSE connections per session. Affinity has to map all of them to the same instance, typically via a session-ID-carrying header or cookie on each.

Known uses

  • Fly.io (2025-04-08) — tenant-controlled dynamic request routing on Fly Proxy; the canonical wiki instance for MCP SSE specifically.
  • Cloudflare Agents SDK — structural 1:1 via concepts/one-to-one-agent-instance; alternate shape for the same requirement.