Redis¶
Redis is an in-memory data-structure store — key-value on the outside, but with first-class server-side support for lists, hashes, sorted sets, streams, and pub/sub. Persistence is optional (RDB snapshots + append-only log). Typically deployed as a cache, a fast serving tier for precomputed artifacts, a lightweight message broker, or a rate-limiter / counter store. Managed offerings (AWS ElastiCache, Google Memorystore, Redis Cloud) are the common deployment in production.
Properties relevant to system design¶
- Single-threaded command execution on the primary (client-facing) instance — atomicity for single commands; no in-process locking.
- Sub-millisecond in-memory reads when the dataset fits in RAM.
- Replication + cluster sharding for scale; read replicas for read fan-out.
- TTL on keys for cache-with-expiry as a first-class primitive.
- Not a source of truth. Durability is best-effort; treat as a cache / derived read model and keep the authoritative copy elsewhere.
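The TTL primitive above is what makes cache-with-expiry a one-liner in Redis (`SET key value EX seconds`). A minimal in-process sketch of the semantics and the cache-aside pattern built on it — the `TTLCache` class is an illustrative stand-in, not a Redis client:

```python
import time

class TTLCache:
    """In-process stand-in for Redis's key-TTL primitive
    (SET key value EX seconds), to illustrate expiry semantics."""

    def __init__(self, clock=time.monotonic):
        self._store = {}   # key -> (value, expires_at)
        self._clock = clock

    def set(self, key, value, ttl_seconds):
        # Equivalent to: SET key value EX ttl_seconds
        self._store[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        # Equivalent to: GET key; an expired key reads as missing
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]   # lazy eviction, like Redis's passive expiry
            return None
        return value

def get_or_compute(cache, key, compute, ttl_seconds=60):
    """Cache-aside: serve from cache, else recompute and store with a TTL.
    The authoritative copy lives elsewhere; the cache is derived."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl_seconds)
    return value
```

With a real Redis client the wrapper is the same shape; only `get`/`set` become network calls.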
Seen in¶
- sources/2025-12-18-mongodb-token-count-based-batching-faster-cheaper-embedding-inference — Voyage AI by MongoDB uses Redis as the queue substrate for token-count-based batching on the query side of embedding inference. Each request is enqueued on a Redis list with an attached token_count; model servers run an atomic Lua script that pops items from the list until the total token count reaches the model-and-hardware-specific optimal batch size (~600 tokens for voyage-3 on A100), and sets per-item TTLs in the same single atomic call. Redis's single-threaded script execution guarantees no two model-server workers race on the same items. Canonical wiki instance of patterns/atomic-conditional-batch-claim. Caveat: "the probability of Redis losing data is very low. In the rare case that it does happen, users may receive 503 Service Unavailable errors and can simply retry" — Redis chosen specifically for its atomic peek-and-claim Lua primitive, trading durability for a batching primitive RabbitMQ / Kafka don't natively offer. Enables Voyage AI's 50% GPU-inference-latency reduction with 3× fewer GPUs on voyage-3-large.
- sources/2024-12-10-canva-routing-print-orders — Canva Print Routing stores per-destination-region precomputed routing graphs in ElastiCache/Redis. 6 ms retrieval in most regions, 20 ms for the largest; 99.999% availability (with read replicas). The routing graphs are async-rebuilt from a relational source of truth, so a Redis outage can be recovered from without data loss — the authority lives in the relational store.
- sources/2026-04-21-figma-figcache-next-generation-data-caching-platform — Figma FigCache fronts a fleet of ElastiCache Redis clusters with an in-house RESP-wire-protocol proxy. Context: at Figma scale, Redis evolved from a non-critical component into a critical-path dependency and its connection limits became load-bearing. Rapid client-fleet scale-ups triggered thundering herds of new connections that bottlenecked Redis I/O and degraded availability. Also: Redis Cluster's CROSSSLOT error on multi-key pipelines spanning hash slots is an application-visible footgun; FigCache's fanout engine transparently resolves read-only cases as parallel scatter-gather. Post-FigCache rollout, connection counts on Redis clusters dropped by an order of magnitude across the board and became much less volatile despite unchanged diurnal traffic patterns; node failovers, cluster scaling, and transient connectivity errors were downgraded from high-sev incidents to zero-downtime background events. Shard failovers now run liberally and frequently across Figma's entire Redis footprint as live resiliency exercises.
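The atomic-claim-up-to-budget primitive in the Voyage AI entry above can be sketched as follows. The Lua text and the `claim_batch` helper are illustrative, not the production script — Redis runs the whole Lua script on its single command thread, which is what makes the pop-until-budget loop race-free:

```python
from collections import deque

# Illustrative Lua sketch (not Voyage AI's actual script). Because Redis
# executes scripts atomically, no two workers can claim the same items;
# the production variant also sets per-item TTLs in the same call.
CLAIM_BATCH_LUA = """
local budget = tonumber(ARGV[1])
local total, items = 0, {}
while total < budget do
  local raw = redis.call('LPOP', KEYS[1])
  if not raw then break end
  total = total + cjson.decode(raw).token_count
  items[#items + 1] = raw
end
return items
"""

def claim_batch(queue: deque, token_budget: int) -> list:
    """Pure-Python reference of the same logic, for illustration only.
    Pops queued requests until the accumulated token_count reaches the
    budget (the item that crosses the budget is included)."""
    total, batch = 0, []
    while total < token_budget and queue:
        item = queue.popleft()
        total += item["token_count"]
        batch.append(item)
    return batch
```

Note what the queue substrate must provide: not just FIFO pop, but an atomic "pop repeatedly until a data-dependent condition holds" — the primitive RabbitMQ and Kafka consumers don't natively offer.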
Failure modes at scale (Figma FigCache retrospective)¶
The FigCache rearchitecture documents Redis failure modes worth naming:
- Connection-volume saturation. Even before reaching Redis's hard connection limit, growing fleet-wide connection counts degrade I/O throughput and increase tail latency.
- Thundering-herd on scale-up. Elastic client fleets open many new TCP+TLS connections simultaneously; the handshake burst bottlenecks Redis for existing clients.
- Client-ecosystem fragmentation. Different client libraries have inconsistent Redis Cluster awareness, retry/timeout behavior, and observability — making fleet-wide guarantees about client-side state correctness during failovers impossible.
The canonical remedy is a stateless proxy tier in front of Redis that performs concepts/connection-multiplexing. See systems/figcache.
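A minimal in-process sketch of what connection multiplexing buys: many callers share a small fixed pool of backend connections instead of each opening their own. Class names here are hypothetical; a real proxy tier like FigCache speaks RESP over actual sockets and runs as a separate stateless service:

```python
import queue

class BackendConnection:
    """Stand-in for one TCP+TLS connection to Redis; counts instances
    so the multiplexing effect is observable."""
    created = 0

    def __init__(self):
        BackendConnection.created += 1

    def execute(self, command):
        return f"OK:{command}"   # pretend round-trip to Redis

class MultiplexingProxy:
    """Sketch of a stateless proxy tier: an arbitrary number of client
    requests is served over a small, fixed pool of backend connections,
    so client-fleet scale-ups never translate into new connections
    (and new TLS handshakes) hitting Redis."""

    def __init__(self, pool_size: int):
        self._pool = queue.Queue()
        for _ in range(pool_size):
            self._pool.put(BackendConnection())

    def execute(self, command):
        conn = self._pool.get()        # borrow a pooled connection
        try:
            return conn.execute(command)
        finally:
            self._pool.put(conn)       # return it for the next caller
```

The pool bound is the point: backend connection count is fixed at deploy time, independent of how many clients sit in front of the proxy.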
Related¶
- aws-elasticache — AWS's managed Redis/Memcached service
- systems/figcache — Figma's in-house proxy tier in front of ElastiCache Redis
- patterns/caching-proxy-tier — the architectural pattern for fronting Redis at scale
- concepts/connection-multiplexing — the reason to put a proxy there
- systems/canva-print-routing
- patterns/async-projected-read-model
- patterns/atomic-conditional-batch-claim — Redis + Lua as the canonical native substrate for peek + atomic-claim-up-to-budget batching
- concepts/token-count-based-batching — Voyage AI's application of the pattern to GPU embedding inference
- systems/voyage-ai, systems/vllm — Voyage AI's serving stack consuming batches from Redis