
PATTERN · Cited by 2 sources

Caching proxy tier

Interpose a stateless proxy tier speaking the cache's native wire protocol between applications and the underlying cache fleet (Redis, Memcached, etc.), pushing cross-cutting responsibilities — connection pooling, traffic routing, protocol-semantics-aware guardrails, observability, cluster-topology handling — out of application code and into a platform-owned data plane.

Canonical shape

  • Data plane: proxy fleet, stateless, horizontally scalable. Speaks the cache's native wire protocol in both directions (inbound from clients, outbound to cache).
  • Clients: thin first-party wrappers over existing OSS cache clients, pointed at the proxy endpoint (drop-in — one-line endpoint change when done right, see patterns/protocol-compatible-drop-in-proxy).
  • Upstream: one or more cache clusters (possibly multi-cluster, multi-backend, multi-region) — the proxy hides the fanout.
  • Control plane: routing configuration, policy, rollout feature flags. Authored as code/config; deployed into the proxy at runtime (see patterns/starlark-configuration-dsl for an expressive option).
  • Observability substrate: every command crosses the proxy → uniform metrics / logs / traces regardless of client language or version.
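The "native wire protocol in both directions" requirement above is concrete enough to sketch. Below is a minimal illustration of RESP framing (Redis's wire protocol): encoding and decoding an array-of-bulk-strings command, which is the framing the proxy's inbound and outbound sides must agree on. Function names are ours for illustration, not FigCache's.

```python
def encode_resp(args):
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for a in args:
        b = a if isinstance(a, bytes) else str(a).encode()
        out.append(f"${len(b)}\r\n".encode() + b + b"\r\n")
    return b"".join(out)

def decode_resp_command(buf):
    """Decode one RESP array-of-bulk-strings command; returns (args, rest)."""
    head, _, rest = buf.partition(b"\r\n")
    assert head[:1] == b"*", "expected a RESP array"
    args = []
    for _ in range(int(head[1:])):
        size_line, _, rest = rest.partition(b"\r\n")
        assert size_line[:1] == b"$", "expected a bulk string"
        n = int(size_line[1:])
        args.append(rest[:n])
        rest = rest[n + 2:]  # skip payload plus trailing \r\n
    return args, rest
```

Because both sides speak the same framing, the proxy can parse, inspect, and re-emit commands verbatim — which is what makes the semantics-aware responsibilities below possible.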

Responsibilities the proxy tier absorbs

  • Connection multiplexing — decouples client-fleet elasticity from backend connection load. Primary architectural driver at scale.
  • Traffic routing across multiple clusters / backends. Engine-tree or rule-based routing (by command, key prefix, key shape) replaces per-application cluster-endpoint configuration.
  • Topology-change absorption. Node failovers, cluster scaling, transient connectivity errors handled at the proxy; clients see zero-downtime events.
  • Command-semantics-aware guardrails. Command-type allowlists / denylists, key-prefix rejection, QoS / priority backpressure, multi-upstream traffic mirroring, distributed locks as custom commands.
  • Inline data transformation. Encryption / compression applied transparently without every client library implementing them.
  • Uniform observability. Availability / throughput / latency / payload size / command cardinality / connection distribution per command, sliced by workload metadata attached at the routing tier.
  • Cluster-mode emulation / protocol translation. Let clients with heterogeneous cluster-awareness connect uniformly.
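As one illustration of the guardrail bullet: a check of this shape could run after the proxy parses a command and before it forwards upstream. The policy contents, set names, and function signature are hypothetical, not FigCache's.

```python
# Operationally dangerous commands and rejected key prefixes — example policy.
DENYLISTED_COMMANDS = {"FLUSHALL", "FLUSHDB", "KEYS"}
REJECTED_KEY_PREFIXES = ("tmp:", "debug:")

def check_command(name, key=None):
    """Return (allowed, reason); runs inline before forwarding upstream."""
    if name.upper() in DENYLISTED_COMMANDS:
        return False, f"command {name} denied by policy"
    if key is not None and key.startswith(REJECTED_KEY_PREFIXES):
        return False, f"key prefix rejected: {key}"
    return True, "ok"
```

The point of doing this in the proxy rather than the client is that the check is enforced once, uniformly, regardless of client language or version.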

Why build it

Three structural pressures make this worth the proxy-hop cost:

  • Redis (or cache-of-choice) connection limits at scale.
  • Fragmented client ecosystems — multiple languages × versions × cluster-awareness × TLS configs — producing inconsistent observability and correctness during failovers.
  • Platform-scope concerns (encryption / routing / QoS / observability) that otherwise recur in every application.

Existing remedies — per-app client-side connection pooling, manual Redis dependency removal on hot paths — isolate symptoms locally but don't close the structural gap.
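The connection-limit pressure is easiest to see as arithmetic. The numbers below are hypothetical; the structure of the calculation is the point — the backend's connection count is decoupled from client-fleet size and tracks the (much smaller, centrally sized) proxy fleet instead.

```python
# Direct: every application instance holds its own pool against Redis.
app_instances = 2_000
pool_per_instance = 16
direct_connections = app_instances * pool_per_instance

# Via proxy: Redis only sees the proxy fleet's multiplexed upstream pools.
proxy_nodes = 50
upstream_pool_per_proxy = 64
proxied_connections = proxy_nodes * upstream_pool_per_proxy

print(direct_connections, proxied_connections)  # 32000 vs 3200
```

An order-of-magnitude reduction of this shape is consistent with the connection-count drop reported for FigCache below.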

Trade-offs

  • Latency penalty from the extra network hop + proxy I/O. Must be controlled: benchmark-suite + golden-baseline CI gates + zonal traffic colocation (patterns/zone-affinity-routing applied to the client → proxy hop) + production stress tests at multiples of organic peak.
  • Reliability of the proxy tier is now load-bearing. Must be stateless, horizontally scalable, deployed across multiple AZs. Feature-flag gates for reversible rollout per workload domain.
  • Build-vs-buy pressure. OSS proxies exist (Twemproxy, Envoy w/ Redis filter, Dynomite, various sharded-proxy libraries). Build in-house when the OSS proxies' RPC layer can't extract full structured arguments (blocking semantics-aware guardrails / custom commands) OR when your client ecosystem's fragmentation demands non-standard protocol shims OR when maintaining a fork on top of upstream is logistically brittle.
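The "golden-baseline CI gate" in the latency bullet can be sketched as a p99 comparison against an agreed regression budget. The percentile math below is a naive sorted-index version (adequate for a sketch); the budget and sample data are made up.

```python
def p99(samples_ms):
    """Naive p99: index into the sorted samples."""
    s = sorted(samples_ms)
    return s[int(0.99 * (len(s) - 1))]

def latency_gate(baseline_ms, candidate_ms, budget_ms=0.3):
    """Pass only if the proxied path's p99 stays within budget of the golden baseline."""
    delta = p99(candidate_ms) - p99(baseline_ms)
    return delta <= budget_ms, delta

golden = [i * 0.01 for i in range(100)]          # recorded direct-to-Redis run
proxied = [i * 0.01 + 0.2 for i in range(100)]   # same workload through the proxy
ok, delta = latency_gate(golden, proxied)        # +0.2 ms at p99: within budget
```

Wiring a check like this into CI keeps the proxy-hop cost an explicit, enforced budget rather than a slow drift.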

Rollout strategy (battle-tested shape)

  1. Migrate clients to first-party wrappers — interface-compatible over existing OSS clients, no protocol change, no endpoint change. Earns uniform observability + config guardrails ahead of the proxy.
  2. Productionize the proxy independently — tackle scalability / reliability / operability / observability without application churn.
  3. Cut applications over to the proxy gradually, reversibly, feature-flag-gated, incrementally per workload domain (never all-or-nothing for large services).
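Step 3's reversible, per-workload cutover reduces, inside the client wrapper, to an endpoint choice behind a feature flag. A minimal sketch — the flag store, workload names, and hostnames are hypothetical:

```python
DIRECT_ENDPOINT = "redis-cluster.internal:6379"  # hypothetical hostnames
PROXY_ENDPOINT = "cache-proxy.internal:6379"

# Per-workload-domain rollout flags, fetched from whatever flag system exists.
FLAGS = {"checkout": True, "search": False}

def endpoint_for(workload):
    """Route through the proxy only where the workload's flag is on.
    Flipping the flag back is the rollback path — no client redeploy."""
    return PROXY_ENDPOINT if FLAGS.get(workload, False) else DIRECT_ENDPOINT
```

Defaulting unknown workloads to the direct endpoint keeps the cutover opt-in and fail-safe.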

Caveats

  • Pipelining correctness is non-trivial when multiplexing. Transactions / pub-sub / blocking commands pin outbound connections; the proxy must track and respect that.
  • Cross-shard operations (multi-key pipelines on cluster-mode Redis) require explicit handling — FigCache's fanout engine resolves read-only cases transparently as parallel scatter-gather; writes usually surface as errors to clients.
  • "Platform" responsibility transfers — caching becomes a platform concern with an SLO, not a per-app concern. Staffing and on-call model must follow.
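The pinning caveat can be made concrete: once a session issues a transaction, subscription, or blocking command, its upstream connection must stop being multiplexed until the stateful exchange ends. A simplified tracker, with illustrative command sets (a real proxy also unpins when a blocking reply completes, which this sketch omits):

```python
PINNING_COMMANDS = {"MULTI", "WATCH", "SUBSCRIBE", "PSUBSCRIBE", "BLPOP", "BRPOP"}
UNPINNING_COMMANDS = {"EXEC", "DISCARD", "UNSUBSCRIBE", "PUNSUBSCRIBE", "RESET"}

class ClientSession:
    def __init__(self):
        self.pinned = False

    def observe(self, command):
        """Track whether this session may still share a pooled upstream connection."""
        c = command.upper()
        if c in PINNING_COMMANDS:
            self.pinned = True
        elif c in UNPINNING_COMMANDS:
            self.pinned = False
        return self.pinned
```

While `pinned` is true, every command in the session must go to the same upstream connection — routing anything to a different pooled connection mid-transaction silently breaks MULTI/EXEC semantics.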

Seen in

  • sources/2026-04-21-figma-figcache-next-generation-data-caching-platform — FigCache is the canonical instantiation: stateless RESP-wire-protocol proxy between Figma's apps and AWS ElastiCache Redis clusters. Six-nines uptime post-rollout; order-of-magnitude drop in Redis connection counts; diagnosis time hours/days → minutes; cluster operational events (failovers, scaling, transient errors) degraded from high-sev incidents to zero-downtime background ops.
  • sources/2024-02-15-flyio-globally-distributed-object-storage-with-tigris — object-storage variant, shape-adjacent not identical. Tigris's byte-cache layer (Fly.io NVMe volumes per region) sits in front of a distributed byte store and serves regional reads locally, matching the "proxy tier in front of upstream storage" contour. But it differs from the Redis/cache variant on coherence: the NVMe-cached copy is a first-class replica discoverable via FoundationDB metadata, not a TTL-expiring cache entry. Fly.io calls this out explicitly — "Tigris isn't a CDN, but rather a toolset that you can use to build arbitrary CDNs, with consistency guarantees, instant purge and relay regions." Useful data point: the cache-tier shape generalises across storage layers (Redis, object storage), but the cache-vs-replica distinction matters and scales with how critical strong consistency is. See patterns/metadata-db-plus-object-cache-tier for the full three-layer pattern this fits inside.