CONCEPT Cited by 1 source

Write-path replication

Write-path replication is the ingest-side pattern where each write is replicated N-way across ingester/writer nodes before the write is considered durable and acknowledged to the client. It guarantees durability under single-node failure on the hot path, at the cost of N× compute, memory, and network on every ingest.

The Cortex-era pattern

Cortex, the common architectural ancestor of Mimir, Loki, and Pyroscope 1.x, replicates each incoming sample on the write path, typically to three ingester replicas. The write is treated as durable, and acknowledged, as soon as a quorum of those ingesters holds the sample in memory; flushing to object storage happens later, asynchronously.
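The quorum rule above can be sketched in a few lines. This is a minimal illustration, not Cortex code: the `Ingester` class, its `healthy` flag, and `replicated_write` are hypothetical names, and a simple majority is assumed as the quorum.

```python
from dataclasses import dataclass, field


@dataclass
class Ingester:
    """Hypothetical in-memory ingester; `healthy` simulates node failure."""
    name: str
    healthy: bool = True
    samples: list = field(default_factory=list)

    def push(self, sample) -> bool:
        # An unhealthy node rejects the write; a healthy one buffers it in memory.
        if not self.healthy:
            return False
        self.samples.append(sample)
        return True


def replicated_write(sample, replicas) -> bool:
    """Send the sample to every replica; ACK only if a majority accepted it."""
    quorum = len(replicas) // 2 + 1
    accepted = sum(1 for r in replicas if r.push(sample))
    return accepted >= quorum


replicas = [Ingester("a"), Ingester("b"), Ingester("c")]

replicas[2].healthy = False  # one node down: 2 of 3 still form a quorum
acked_one_down = replicated_write({"series": "cpu", "value": 1.0}, replicas)

replicas[1].healthy = False  # two nodes down: quorum is lost
acked_two_down = replicated_write({"series": "cpu", "value": 2.0}, replicas)
```

With three replicas the write survives one failed ingester but not two, which is the single-node-failure tolerance described at the top of this page.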

This made sense in mid-2010s Prometheus-backend design: object storage latency was too high to sit on the write path, and the ingester tier was already the hot tier for recent-data queries. Write-path replication piggybacked on an ingester tier that already existed to serve reads.

Why it's the cost being retired

At observability-DB scale, write-path replication becomes the dominant cost. With the typical replication factor of three:

  • Compute tripled. Three ingesters do the work of one.
  • Network tripled. Every write crosses the network once per replica.
  • Memory tripled. Each ingester buffers a full replica of recent data.
  • Scaling couples read capacity to write capacity. The ingester tier is dual-purpose, so scaling to handle a write spike scales the read path too (and vice versa).
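The multipliers in the list above are simple arithmetic. The sketch below is a back-of-the-envelope model, not a measurement: `write_path_cost` is a hypothetical helper, and the ingest rate and buffer window are made-up illustrative numbers.

```python
def write_path_cost(ingest_bytes_per_sec: int,
                    buffer_window_sec: int,
                    replication_factor: int) -> dict:
    """Rough model: every replica receives and buffers the full write stream."""
    return {
        # Each byte is sent once per replica.
        "network_bytes_per_sec": ingest_bytes_per_sec * replication_factor,
        # Each replica holds the whole recent-data window in memory.
        "buffered_bytes": ingest_bytes_per_sec * buffer_window_sec * replication_factor,
    }


# Illustrative inputs only: 50 MB/s of ingest, one hour of data held in memory.
unreplicated = write_path_cost(50_000_000, 3600, replication_factor=1)
replicated = write_path_cost(50_000_000, 3600, replication_factor=3)
```

Under these assumed inputs the replicated tier moves 150 MB/s instead of 50 MB/s and buffers 540 GB instead of 180 GB.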

For a continuous-profiling database where payloads are already heavy, the N× multiplier is especially painful.

The rearchitecture: push durability to object storage

Mimir's recent rearchitecture, and now Pyroscope 2.0, retire write-path replication. Durability is delegated to object storage (S3, GCS, Azure Blob), which is already N-way replicated internally. Per the Pyroscope 2.0 post:

"Mimir recently redesigned its architecture to eliminate write-path replication, decouple reads from writes, and make object storage the single source of truth. Pyroscope 2.0 applies similar architectural principles, adapted for the unique characteristics of profiling data."

(Source: sources/2026-04-22-grafana-introducing-pyroscope-2-0)

Under the new design:

  • Ingesters become stateless batch-buffers. Accept writes, batch them into blocks, commit blocks to object storage. No replication on the ingest side.
  • Readers query object storage directly. Independent tier, scaled independently.
  • Durability is object-storage durability (S3, for example, is designed for 99.999999999%, or 11 nines, of object durability) rather than in-memory-quorum durability.
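The ingest side of this design can be sketched as a buffer that acknowledges samples only after their block has been written to the object store. This is a simplified illustration and not the Mimir or Pyroscope implementation: `InMemoryObjectStore`, `BatchingIngester`, the JSON block encoding, and the size-only flush trigger are all assumptions made for the example.

```python
import json


class InMemoryObjectStore:
    """Stand-in for S3/GCS/Azure Blob; a real store replicates internally."""

    def __init__(self):
        self.objects = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


class BatchingIngester:
    """Buffers samples and ACKs them only once their block is in the store."""

    def __init__(self, store, batch_size: int):
        self.store = store
        self.batch_size = batch_size
        self.pending = []        # samples not yet durable
        self.acked = 0           # samples committed to object storage
        self.blocks_written = 0

    def write(self, sample: dict) -> None:
        self.pending.append(sample)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        key = f"blocks/{self.blocks_written:08d}.json"
        # The PUT is the durability point: no peer replication is involved.
        self.store.put(key, json.dumps(self.pending).encode())
        self.blocks_written += 1
        self.acked += len(self.pending)
        self.pending = []


store = InMemoryObjectStore()
ingester = BatchingIngester(store, batch_size=3)
for i in range(7):
    ingester.write({"profile_id": i})

acked_before_flush = ingester.acked  # two full blocks committed so far
ingester.flush()                     # commit the partial final block
```

A production ingester would presumably also flush on a timer so that a partial batch does not wait indefinitely; that is omitted here to keep the sketch short.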

See patterns/decouple-reads-from-writes-at-storage-layer and patterns/observability-db-rearchitecture-cortex-to-object-store for the pattern framing.

Tradeoffs

Eliminating write-path replication isn't free:

  • Write acknowledgment latency now depends on object-store PUT latency. Mitigated by batching writes into blocks and ACKing once the batch lands, but bursty-write workloads may see higher p99 ingest latency.
  • The recent-data read path changes. The ingester tier no longer holds hot data for reads; queriers must read from object storage or from a separate recent-data cache. The design shifts complexity from "replicas everywhere" to "tiered read path."
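A tiered read path of the kind described in the last bullet can be sketched as a lookup that tries a recent-data cache before falling back to object storage. The function name `tiered_read` and the use of plain dictionaries for both tiers are assumptions for illustration; a real querier would deal with time ranges, block indexes, and cache invalidation.

```python
def tiered_read(key, recent_cache: dict, object_store: dict):
    """Return (value, tier) where tier names the layer that served the read."""
    if key in recent_cache:
        return recent_cache[key], "cache"
    if key in object_store:
        return object_store[key], "object-store"
    return None, "miss"


recent_cache = {"block-0003": b"recent profiles"}
object_store = {
    "block-0001": b"older profiles",
    "block-0003": b"recent profiles",
}

hot = tiered_read("block-0003", recent_cache, object_store)
cold = tiered_read("block-0001", recent_cache, object_store)
absent = tiered_read("block-9999", recent_cache, object_store)
```

The point of the sketch is that the cache is an optimization only: every block it holds is also in object storage, which remains the single source of truth.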

These tradeoffs are the engineering substance of the rearchitectures; the payoff, a large cost reduction at observability-DB scale, is what justifies the shift in complexity.
