PINTEREST Tier 2

Pinterest — HBase Deprecation at Pinterest

Alberto Ordonez Pereira (Sr. Staff SWE) + Lianghong Xu (Sr. Manager, Engineering) publish Part 1 of a 3-part Pinterest Engineering series (2024-05-14) on the decade-long arc from HBase as Pinterest's default NoSQL store (2013) to a deliberate end-of-life decision (end of 2021) and a multi-year migration to a portfolio of workload-specific datastores fronted by a unified storage service (SDS). Part 1 is the retrospective + rationale; Part 2 (announced) covers the evaluation that selected TiDB; Part 3 covers the consolidated serving layer. This first post is the canonical "why we deprecated HBase" write-up on the wiki — no migration execution detail, no numbers on the new stack, but an unusually explicit five-reason framework for retiring a load-bearing NoSQL store at a company that ran "one of the largest production deployments of HBase in the world."

Summary

From its 2013 introduction, HBase became the foundational NoSQL storage backend at Pinterest and the substrate underneath a sprawling ecosystem of in-house services: graph service (Zen), wide-column store (UMS), monitoring storage (OpenTSDB), metrics reporting (Pinalytics), transactional DB (Omid/Sparrow), indexed datastore (Ixia), and a long tail of business-critical workloads — smartfeed, URL crawler, user messages, pinner notifications, ads indexing, shopping catalogs, Statsboard, experiment metrics. At peak, HBase spanned ~50 clusters and ~9,000 AWS EC2 instances, storing over 6 PB of data. Each production deployment used a primary + standby cluster inter-replicated via write-ahead logs (WALs) — the primary served online requests; the standby ran offline workflows + daily backups, with cluster-level failover on primary failure.
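The primary + standby deployment described above can be sketched in a few lines. This is a minimal, hypothetical Python model, not Pinterest code: the primary logs each mutation to a WAL and ships entries to the standby, online reads hit the primary, backups run on the standby, and failover swaps the two cluster roles.

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    name: str
    data: dict = field(default_factory=dict)
    wal: list = field(default_factory=list)

    def put(self, key, value):
        self.wal.append((key, value))   # log before apply (durability)
        self.data[key] = value

    def replay(self, entries):
        for key, value in entries:      # standby applies shipped WAL entries
            self.wal.append((key, value))
            self.data[key] = value

class Deployment:
    def __init__(self):
        self.primary = Cluster("primary")
        self.standby = Cluster("standby")

    def write(self, key, value):
        self.primary.put(key, value)
        # WAL shipping is asynchronous in practice; modeled synchronously here
        self.standby.replay([(key, value)])

    def read_online(self, key):         # online traffic routed to the primary
        return self.primary.data.get(key)

    def run_backup(self):               # resource-intensive ops on the standby
        return dict(self.standby.data)

    def failover(self):                 # cluster-level failover on primary loss
        self.primary, self.standby = self.standby, self.primary

d = Deployment()
d.write("pin:1", "v1")
d.failover()
assert d.read_online("pin:1") == "v1"   # promoted standby serves the data
```

The sketch collapses what is really log shipping between three-way-replicated clusters into two dictionaries, but it captures the routing and failover behaviour the post describes.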

At the end of 2021 Pinterest decided to deprecate HBase across its entire production footprint. The post enumerates five reasons:

  1. High maintenance cost — Pinterest's HBase was ~5 years behind upstream, missing critical bug fixes and improvements. A previous upgrade (0.94 → 1.2) took "almost two years" due to a legacy build/deploy/provisioning pipeline + compatibility issues. HBase domain experts were increasingly hard to find; barriers to entry were high for new engineers.
  2. Missing functionality — HBase's relatively simple NoSQL interface couldn't meet evolving requirements: stronger consistency, distributed transactions, global secondary index, rich query. The Zen graph service suffered bugs and incidents specifically because HBase lacks distributed transactions — partial-failure updates left graphs in inconsistent states.
  3. High system complexity — advanced features were bolted on above HBase: Ixia (real-time indexing on HBase) + Manas realtime (global secondary indexing) + Sparrow (Apache Phoenix + Omid for distributed transactions). Each was significant development + maintenance load.
  4. High infra cost — primary-standby design meant 6 data replicas per logical record (3 in each cluster). Alternative stores could do 3 replicas without sacrificing availability SLA with careful placement + replication — "TiDB, Rockstore, or MySQL may use three replicas without sacrificing much on availability SLA". At Pinterest scale the factor-of-two replica reduction was a large infra-spend opportunity.
  5. Waning industry usage + community support — peer companies were migrating away; talent pool was shrinking; new engineers had low incentive to specialise in HBase.
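The partial-failure mode behind reason 2 can be made concrete. The sketch below is purely illustrative (not Zen's actual schema or code): a bidirectional edge stored as two rows, where a failure between the two writes leaves the graph's symmetry invariant violated, while an atomic commit keeps both sides consistent.

```python
store = {}  # row key -> value, standing in for two HBase rows per edge

def add_edge_nontransactional(a, b, fail_midway=False):
    store[f"out:{a}:{b}"] = True        # write 1 succeeds
    if fail_midway:
        raise RuntimeError("server died")  # simulated partial failure
    store[f"in:{b}:{a}"] = True         # write 2 never happens

def edge_consistent(a, b):
    # symmetry invariant: forward and reverse rows must agree
    return store.get(f"out:{a}:{b}") == store.get(f"in:{b}:{a}")

try:
    add_edge_nontransactional("u1", "u2", fail_midway=True)
except RuntimeError:
    pass
assert not edge_consistent("u1", "u2")  # half-written edge persists

# With transactional semantics, both writes commit or neither does:
def add_edge_transactional(a, b, fail_midway=False):
    staged = {f"out:{a}:{b}": True, f"in:{b}:{a}": True}
    if fail_midway:
        raise RuntimeError("aborted")   # atomic abort: nothing applied
    store.update(staged)                # atomic commit (single-thread model)

add_edge_transactional("u3", "u4")
assert edge_consistent("u3", "u4")
```

This is exactly the class of inconsistency the post attributes to Zen incidents: without a transaction spanning both rows, debugging means hunting for dangling half-edges after the fact.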

The path to complete deprecation was unlocked because Pinterest had already begun retiring HBase for specific workload classes where it had realised HBase was not the optimal engine:

  • OLAP / online analytics → Druid + StarRocks. HBase "performed worse than state-of-the-art solutions for OLAP workloads."
  • Time-series data → Goku (in-house time-series datastore). HBase "was not able to keep up with the ever increasing time series data volume" → scalability, performance, and maintenance-load problems.
  • Key-value → KVStore (in-house KV built on RocksDB + Rocksplicator for real-time replication). HBase "was not as performant or infra efficient" for KV use cases.

These workload-specific migrations showed Pinterest that a single replacement for the remaining HBase workloads would need NoSQL scalability + RDBMS query power + ACID semantics. The evaluation outcome (Part 2) was TiDB — a distributed NewSQL database that "satisfied most of our requirements."
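The decomposition above amounts to a routing table from access pattern to purpose-built store. The sketch below is illustrative only: the store names come from the post, but the classification function is a hypothetical stand-in, not a real Pinterest service registry.

```python
# Workload-specific datastore migration pattern: classify each use case by
# access pattern, then rehome it to a purpose-built store, instead of a
# one-for-one substrate swap.
REHOMING = {
    "olap": ["Druid", "StarRocks"],
    "time_series": ["Goku"],
    "key_value": ["KVStore"],
    "transactional_rich_query": ["TiDB"],  # the remaining "general" slot
}

def rehome(access_pattern: str) -> list:
    try:
        return REHOMING[access_pattern]
    except KeyError:
        raise ValueError(f"unclassified workload: {access_pattern}")

assert rehome("time_series") == ["Goku"]
assert "TiDB" in rehome("transactional_rich_query")
```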

Key takeaways

  1. Pinterest ran one of the largest HBase deployments globally — and still deprecated it. "Pinterest hosted one of the largest production deployments of HBase in the world. At its peak usage, we had around 50 clusters, 9000 AWS EC2 instances, and over 6 PBs of data." The scale is load-bearing for the rest of the post: this isn't a small-team migration; it's a signal from someone with as much HBase operational experience as exists in industry. (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

  2. HBase's ecosystem at Pinterest was broader than HBase itself. Named HBase-backed services: Zen (graph), UMS (wide-column), OpenTSDB (monitoring), Pinalytics (metrics reporting), Omid/Sparrow (transactional DB), Ixia (indexed datastore). Business workloads: smartfeed, URL crawler, user messages, pinner notifications, ads indexing, shopping catalogs, Statsboard monitoring, experiment metrics. HBase deprecation is therefore organisationally a deprecation of an entire subsystem family — each service either migrates or rehomes. (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

  3. Standard Pinterest HBase production deployment = primary + standby inter-replicated via WAL, failover-ready. "A typical production deployment consists of a primary cluster and a standby cluster, inter-replicated between each other using write-ahead-logs (WALs) for extra availability. Online requests are routed to the primary cluster, while offline workflows and resource-intensive cluster operations (e.g., daily backups) are executed on the standby cluster. Upon failure of the primary cluster, a cluster-level failover is performed to switch the primary and standby clusters." Canonical WAL-replicated primary-standby instance on the wiki. Both clusters were three-way replicated → 6 replicas per record in total. (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

  4. The HBase-version-debt vicious cycle: outdated version → painful upgrade → more outdated. "Due to historical reasons, our HBase version was five years behind the upstream, missing critical bug fixes and improvements. Yet the HBase version upgrade is a slow and painful process due to a legacy build/deploy/provisioning pipeline and compatibility issues (the last upgrade from 0.94 to 1.2 took almost two years)." Canonical wiki instance of tech-debt compounding in a load-bearing storage system — the store gets more critical over time (more dependent services), the upgrade gets more painful (more services to test), the engineer pool shrinks (fewer experts join a legacy stack), and each year the next upgrade is harder than the last. (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

  5. Missing distributed transactions caused real production incidents in the graph service. "The lack of distributed transactions in HBase led to a number of bugs and incidents of Zen, our in-house graph service, because partially failed updates could leave a graph in an inconsistent state. Debugging such problems was usually difficult and time-consuming, causing frustration for service owners and their customers." Concrete named failure mode — graph updates touching multiple rows/edges can partially succeed, leaving graph consistency invariants violated. Canonical wiki example of missing distributed-transaction semantics leaking into the application layer as debugging burden. (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

  6. Bolt-on architecture → complexity tax. "To provide these advanced features for customers, we built several new services on top of HBase over the past few years. For example, we built Ixia on top of HBase and Manas realtime to support global secondary indexing in HBase. We also built Sparrow on top of Apache Phoenix Omid to support distributed transactions on top of HBase. While we had no better alternatives to satisfy the business requirements back then, these systems incurred significant development costs and increased the maintenance load." When the substrate lacks features, the organisation re-implements them in services above it; over time the services become the complexity. Every advanced-feature service is a separate codebase, separate on-call, separate upgrade cadence — all layered over the same underlying HBase. (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

  7. 6-replica primary-standby was durable but infra-expensive. "Production HBase clusters typically used a primary-standby setup with six data replicas for fast disaster recovery, which, however, came at an extremely high infra cost at our scale. Migrating HBase to other data stores with lower cost per unique data replica would present a huge opportunity of infra savings. For example, with careful replication and placement mechanisms, TiDB, Rockstore, or MySQL may use three replicas without sacrificing much on availability SLA." Canonical wiki statement of the replica-count vs availability-SLA trade-off: 2× replica count does not buy 2× availability, but it does buy 2× storage cost + 2× write-amplification. At 6 PB raw × 6 replicas = ~36 PB provisioned; halving that is a multi-PB / multi-million-dollar saving. (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

  8. Pinterest already had workload-specific migrations in flight before the formal HBase deprecation. OLAP → Druid + StarRocks; time-series → Goku; key-value → KVStore (on RocksDB + Rocksplicator). "In the past few years, several initiatives were started to replace HBase with more suitable technologies for these use case scenarios." Canonical wiki instance of the workload-specific datastore migration pattern — instead of swapping the substrate globally, decompose the workload by access pattern (OLAP / time-series / KV / transactional) and rehome each axis to a purpose-built store. (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

  9. The final slot — general NoSQL with transactions + rich query + secondary index — needed NewSQL. "To accommodate the remaining HBase use cases, we needed a new technology that offers great scalability like a NoSQL database while supporting powerful query capabilities and ACID semantics like a traditional RDBMS. We ended up choosing TiDB, a distributed NewSQL database that satisfied most of our requirements." TiDB collapses the Sparrow+Omid / Ixia+Manas stacks back into a single substrate with the missing features native. Canonical wiki statement of the NoSQL → NewSQL migration direction driven by requirement creep (consistency + transactions + secondary index + query). (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

  10. Industry signal reinforces the decision — HBase community was shrinking. "For the past few years, we have seen a seemingly steady decline in HBase usage and community activity in the industry, as many peer companies were looking for better alternatives to replace HBase in their production environments. This in turn has led to a shrinking talent pool, higher barrier to entry, and lower incentive for new engineers to become a subject matter expert of HBase." Community-health-as-deprecation-signal — a store's long-term viability isn't just its current behaviour but its upstream velocity + hiring market + peer-company adoption trajectory. (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)
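The infra-cost claim in takeaway 7 reduces to simple arithmetic on the post's own figures (6 PB logical data, 6 vs. 3 replicas); the numbers below use those figures and ignore compression and non-storage costs.

```python
PB = 1  # work in units of petabytes

logical_data = 6 * PB      # "over 6 PBs of data" (logical, pre-replication)

hbase_replicas = 6         # primary cluster (3x) + standby cluster (3x)
alternative_replicas = 3   # e.g. TiDB/Rockstore/MySQL with careful placement

hbase_provisioned = logical_data * hbase_replicas              # ~36 PB
alternative_provisioned = logical_data * alternative_replicas  # ~18 PB

savings = hbase_provisioned - alternative_provisioned
assert savings == 18       # a multi-PB storage reduction at minimum
```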

Architectural numbers

  • HBase introduction year: 2013 (Pinterest's first NoSQL datastore)
  • Peak HBase clusters: ~50 (production, global)
  • Peak EC2 instances: ~9,000 (all HBase clusters)
  • Peak data volume: >6 PB (logical, pre-replication)
  • Replicas per record: 6 (primary-standby, 3-way each)
  • HBase version lag behind upstream: ~5 years (at time of deprecation decision)
  • Last HBase major upgrade duration: ~2 years (0.94 → 1.2)
  • Deprecation decision timing: end of 2021 (formal go-ahead for full deprecation)
  • Post publication date: 2024-05-14 (Part 1 of 3-part series)
  • HN points: 111 (at publication)

Systems introduced

  • systems/hbase — Apache HBase. Wide-column NoSQL store modelled after Google Bigtable; Pinterest's default NoSQL store 2013–2021; canonical wiki page for HBase.
  • systems/tidb — distributed NewSQL DB from PingCAP; the selected replacement for remaining HBase workloads; offers NoSQL-scale + RDBMS queries + ACID + secondary indexes.
  • systems/pinterest-zen — Pinterest's in-house graph service on HBase; bugs and incidents traced to missing distributed transactions.
  • systems/pinterest-ixia — Pinterest's indexed datastore built on HBase + Manas realtime for global secondary indexing; a bolt-on architecture to work around HBase's limited query capability.
  • systems/pinterest-kvstore — Pinterest's in-house KV store on RocksDB + Rocksplicator; replacement for HBase on KV workloads.
  • systems/rocksplicator — Pinterest's open-source real-time RocksDB data replicator; the replication substrate underneath KVStore.
  • systems/apache-phoenix-omid — Apache Phoenix (SQL on HBase) + Apache Omid (distributed-transactions layer); the stack Pinterest's Sparrow built on to give transactional semantics over HBase.
  • systems/opentsdb — time-series database written on top of HBase; Pinterest's monitoring-storage backend on HBase.
  • systems/pinterest-goku — Pinterest's in-house time-series datastore; replaced HBase for time-series workloads.
  • systems/apache-druid — (pre-existing wiki page) one of Pinterest's OLAP replacements for HBase; enriched at this page with Pinterest as a named adoption.
  • systems/starrocks — OLAP analytical DB; Pinterest's second OLAP replacement for HBase alongside Druid.
  • systems/pinterest-ums — Pinterest's in-house wide-column store on HBase; one of the original HBase-backed systems named in the ecosystem.

Systems reused / extended

  • systems/rocksdb — underlying KV engine of Pinterest's KVStore (post-HBase KV replacement).
  • systems/apache-druid — pre-existing wiki page; Pinterest named as an OLAP adopter replacing HBase.

Concepts introduced

  • concepts/wal-replication — Pinterest's primary-standby clusters are kept in sync via HBase write-ahead-log replication; canonical wiki instance of WAL-shipping inter-cluster replication (distinct from intra-cluster WAL for local crash recovery).
  • concepts/primary-standby-failover — two-cluster online/offline deployment with cluster-level failover on primary loss; canonical wiki instance distinct from per-node replica failover.
  • concepts/replica-cost-tradeoff — canonical wiki statement of 6-replica primary-standby vs 3-replica single-cluster as the durability/availability/cost trade-off; Pinterest cites multi-PB opportunity.
  • concepts/tech-debt-compounding — version lag → painful upgrade → more lag; canonical wiki instance of compounding tech debt in a load-bearing storage system.

Concepts reused / extended

  • concepts/nosql-database — extended with Pinterest's concrete scale datum (50 clusters / 9000 instances / 6 PB) and explicit deprecation rationale; the first canonical wiki write-up of a NoSQL store being retired at hyperscale.
  • concepts/distributed-transactions — new canonical wiki example of what "absence of distributed transactions" looks like in practice (Zen graph inconsistency incidents).

Patterns introduced

  • patterns/nosql-to-newsql-deprecation — explicit five-reason framework for retiring a NoSQL store: (1) maintenance cost, (2) missing functionality, (3) system complexity tax, (4) infra cost, (5) community health. Pinterest provides the canonical wiki instance.
  • patterns/primary-standby-wal-replication — two-cluster deployment, primary + standby, inter-replicated via WAL; primary serves online, standby serves offline + backups; cluster-level failover. Pinterest's HBase deployment as canonical wiki instance.
  • patterns/workload-specific-datastore-migration — instead of one-for-one substrate replacement, decompose workload by access pattern (OLAP / time-series / KV / transactional) and rehome each axis to a purpose-built store. Pinterest's pre-formal-deprecation migrations (Druid/StarRocks + Goku + KVStore) are the canonical wiki instance.

Caveats

  • Part 1 of a 3-part series — limited execution detail. This post is the retrospective + rationale. The evaluation methodology (Part 2) and the unified storage service / SDS consolidation (Part 3) are announced but not walked through. Quantitative wins (cost savings realised, incident rates, availability deltas, migration velocity) are not disclosed here.
  • No post-deprecation numbers. How many workloads migrated to TiDB vs KVStore vs Goku vs Druid/StarRocks isn't broken out. Data-volume distribution across the new stack isn't disclosed. Remaining HBase footprint at post time (2024-05) isn't stated.
  • Rationale framing is one-sided by construction. The post's job is to justify deprecation to internal + external audiences, not to weigh counterfactuals. Arguments for keeping HBase (operational know-how already invested, migration cost, two-platform transition period) are not given equal airtime.
  • Replica-count comparison is illustrative, not measured. "TiDB, Rockstore, or MySQL may use three replicas without sacrificing much on availability SLA" — Pinterest does not disclose the actual availability-SLA delta between 6-replica HBase and 3-replica TiDB as measured in their environment, nor whether the SLA means MTTR, p999 availability, or per-year 9s.
  • "Sparrow" is named but not defined. Pinterest's transactional service built on Apache Phoenix + Omid; the internal name is not on public record beyond this post; no independent write-up cited.
  • Tech-debt vs version-lag framing conflates two things. The 0.94 → 1.2 two-year upgrade was partly a pipeline problem (legacy build/deploy/provisioning) and partly a compatibility problem (schema, API, config). The post does not separate these; future migrations at other companies should diagnose both independently.
  • "Industry decline" claim is not quantified. Named peer companies migrating away are not listed. HBase commits/releases/contributors numbers are not cited. Trajectory is asserted, not measured.
  • No SDS (Storage Data Service) architecture in Part 1. The unified storage service fronting the new stack is mentioned in passing — architecture is Part 3.
  • Pinterest's KVStore is named but not cited to a canonical URL. A 2022 Pinterest post (linked) describes KVStore; Rocksplicator is an open-source Pinterest project with a separate 2019 post. Neither is ingested on the wiki yet as of this ingest.
  • Ixia / Manas realtime distinction is compressed. Ixia is described as "built on top of HBase and Manas realtime"; the post doesn't separate Ixia (real-time indexing) from Manas realtime (the indexing substrate). A separate 2021 Ixia post is linked but not ingested.
