Replica-cost tradeoff¶
Replica-cost tradeoff is the observation that availability does not scale linearly with replica count — doubling replicas does not double durability or availability, but it does double storage cost and write amplification. At PB-scale, the shape of this curve is often the dominant economic factor in a storage-engine choice.
Definition¶
A logical record R stored with N physical replicas has:
- Storage cost ≈ N × |R| (linear in replica count).
- Write amplification ≈ N (each write fans out to all replicas).
- Availability ≈ 1 − p^k, where p is the per-replica failure probability and k is the number of simultaneous replica failures needed to break quorum (k grows with N). The gain from N = 3 to N = 6 is measured in 9s of availability — much smaller than the gain from N = 1 to N = 3.
The tradeoff is sub-linear availability for linear cost — each additional replica buys progressively less marginal availability.
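A back-of-envelope sketch (Python) makes the shape of the curve concrete. The majority-quorum, independent-failure model and p = 0.01 are illustrative assumptions, not figures from the source:

```python
from math import comb

def availability(n: int, p: float) -> float:
    """P(a majority of n replicas is up), assuming independent
    per-replica failure probability p. Illustrative model only."""
    quorum = n // 2 + 1
    return sum(
        comb(n, up) * (1 - p) ** up * p ** (n - up)
        for up in range(quorum, n + 1)
    )

p = 0.01  # assumed per-replica failure probability
for n in (1, 3, 6):
    print(f"N={n}: storage cost {n}x, write-amp {n}x, "
          f"availability {availability(n, p):.6f}")

# N=1: availability 0.990000  (two 9s)
# N=3: availability 0.999702  (~0.0097 absolute gain over N=1)
# N=6: availability 0.999980  (only ~0.0003 absolute gain over N=3)
```

Cost and write-amp march up linearly while the marginal availability per added replica collapses.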
Canonical framing (Pinterest HBase → TiDB)¶
Pinterest's HBase deployment used primary + standby clusters, each internally three-way replicated, for 6 replicas per logical record. Pinterest observed that alternative stores (TiDB, MySQL, RocksDB-based KV) can hit comparable availability SLAs with 3 replicas, given careful placement and replication mechanics:
"Production HBase clusters typically used a primary-standby setup with six data replicas for fast disaster recovery, which, however, came at an extremely high infra cost at our scale. Migrating HBase to other data stores with lower cost per unique data replica would present a huge opportunity of infra savings. For example, with careful replication and placement mechanisms, TiDB, Rockstore, or MySQL may use three replicas without sacrificing much on availability SLA." (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)
At Pinterest's peak scale (6 PB logical), the factor-of-two replica reduction is a multi-PB / multi-million-dollar line item — this is axis 4 of the five-reason NoSQL-to-NewSQL deprecation framework.
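The arithmetic behind that line item, sketched below; the per-PB-month cost is a hypothetical placeholder, since the source only characterizes the saving as multi-million-dollar:

```python
LOGICAL_PB = 6               # Pinterest's stated peak logical footprint
COST_PER_PB_MONTH = 20_000   # hypothetical all-in $/PB-month, not from the source

for replicas in (6, 3):
    provisioned = LOGICAL_PB * replicas
    annual = provisioned * COST_PER_PB_MONTH * 12
    print(f"{replicas} replicas: {provisioned} PB provisioned, ~${annual:,}/year")

# 6 replicas: 36 PB provisioned, ~$8,640,000/year
# 3 replicas: 18 PB provisioned, ~$4,320,000/year
```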
Why it works¶
- Independent failure domains saturate fast. Three replicas in three availability zones already cover most realistic failure modes. The marginal reliability from a fourth or fifth replica is small.
- Cost is almost perfectly linear. Storage, network egress, and CPU for write fanout all scale with replica count.
- Write-amp hits tail latency. More replicas means more fanout before ack; even with tail-at-scale techniques, higher N mechanically raises write p99 (see the sketch after this list).
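A small Monte Carlo of the fanout effect. The lognormal per-replica latency is an assumed shape chosen only for its heavy tail, not measured data; the policy modeled is waiting for acks from every replica:

```python
import random

random.seed(0)

def write_p99(n_replicas: int, trials: int = 100_000) -> float:
    """p99 of simulated write latency when every write waits for acks
    from all n_replicas. Per-replica latency is lognormal: an assumed
    distribution, chosen only to have a realistic heavy tail."""
    latencies = sorted(
        max(random.lognormvariate(1.0, 0.5) for _ in range(n_replicas))
        for _ in range(trials)
    )
    return latencies[int(0.99 * trials)]

for n in (1, 3, 6):
    print(f"N={n}: write p99 ≈ {write_p99(n):.1f} ms")

# p99 rises with N even though the per-replica distribution is fixed:
# the slowest of six straggles more often than the slowest of three.
```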
Tradeoffs¶
- The comparison is SLA-dependent. "Comparable availability SLA" from 6-replica to 3-replica assumes the 3-replica store has good placement + replication mechanics. If the 3-replica store is naively placed (e.g. all in one rack), it won't match the 6-replica shape; the sketch after this list makes the placement sensitivity concrete. Pinterest's claim is "without sacrificing much on availability SLA" — a qualified claim, not a guarantee.
- DR scope matters. If the 6-replica shape protects against whole-region failure (standby in another region), a 3-replica shape in a single region is not equivalent — it's cheaper but narrower.
- Durability is use-case dependent. Not all data deserves the same replica count. See concepts/durability-as-use-case-dependent for PlanetScale's framing of this.
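A sketch of the placement sensitivity flagged in the first tradeoff: the same N = 3 yields different availability depending on whether the replicas share a failure domain. All probabilities here are illustrative assumptions:

```python
from math import comb

P_REPLICA = 0.01  # independent per-replica failure probability (assumed)
P_DOMAIN = 0.001  # rack / AZ outage probability (assumed)

def majority_up(q: float) -> float:
    """P(at least 2 of 3 replicas reachable), each up independently with prob q."""
    return sum(comb(3, u) * q**u * (1 - q)**(3 - u) for u in (2, 3))

# All three replicas in one rack: the rack itself is a shared failure domain.
one_rack = (1 - P_DOMAIN) * majority_up(1 - P_REPLICA)

# One replica per AZ: a replica is down if its AZ is out or it fails alone.
three_azs = majority_up((1 - P_DOMAIN) * (1 - P_REPLICA))

print(f"3 replicas, one rack : {one_rack:.6f}")   # ~0.998702, capped by the rack
print(f"3 replicas, three AZs: {three_azs:.6f}")  # ~0.999640, most of a 9 back
```

Under these assumptions the single-rack layout is dominated by the shared rack failure; spreading the same three replicas across AZs buys back most of a nine at identical cost.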
Seen in¶
- sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest — canonical wiki instance. Pinterest frames the 6-replica primary-standby as axis 4 of its HBase deprecation: "Migrating HBase to other data stores with lower cost per unique data replica would present a huge opportunity of infra savings." At 6 PB × 6 replicas = ~36 PB provisioned, halving the replica multiplier is a multi-million-dollar annual saving.
Related¶
- concepts/wal-replication — the mechanism behind Pinterest's cross-cluster replica count.
- concepts/primary-standby-failover — the failover shape that the 6-replica arrangement exists to support.
- concepts/availability-vs-data-loss-tradeoff — the broader RPO / RTO dial this sits on.
- concepts/durability-as-use-case-dependent — why not all data gets the same replica count.
- patterns/primary-standby-wal-replication — the 6-replica deployment shape.
- patterns/nosql-to-newsql-deprecation — the org-level move justified in part by this tradeoff.