Apache Phoenix + Omid (Sparrow substrate)

Apache Phoenix is a SQL layer over HBase that provides JDBC-compatible SQL access, secondary indexes, and schema management on top of HBase's wide-column model. Apache Omid is a distributed-transactions layer for HBase that provides snapshot isolation across multi-row transactions using a lock-free protocol built around a centralized timestamp oracle. Together, Phoenix and Omid form the open-source substrate that Pinterest's Sparrow transactional service was built on.

Definition

Apache Phoenix:

  • SQL-on-HBase: parses SQL, compiles to HBase scans and puts.
  • Maintains secondary indexes as separate HBase tables with app-level consistency.
  • JDBC driver for clients.
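
The "index as a separate table" idea can be sketched with a toy in-memory model. This is an illustration of the bookkeeping only, not Phoenix's index-maintenance code: `ToyTable` and `IndexedUsers` are hypothetical names, a sorted Python list stands in for HBase's ordered row keys, and the index row key layout (indexed value, a separator, then the primary key) is an assumption of the sketch.

```python
from bisect import bisect_left, insort

SEP = "\x00"  # separator between indexed value and primary key (assumed layout)


class ToyTable:
    """A sorted key -> value map, standing in for one HBase table."""

    def __init__(self):
        self._keys = []  # row keys kept in sorted order, like HBase row keys
        self._rows = {}

    def put(self, key, value):
        if key not in self._rows:
            insort(self._keys, key)
        self._rows[key] = value

    def delete(self, key):
        if key in self._rows:
            del self._rows[key]
            self._keys.remove(key)

    def get(self, key):
        return self._rows.get(key)

    def scan_prefix(self, prefix):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._rows[self._keys[i]]
            i += 1


class IndexedUsers:
    """A data table keyed by user id plus an index table keyed by email."""

    def __init__(self):
        self.data = ToyTable()
        self.index = ToyTable()

    def upsert(self, user_id, email):
        old = self.data.get(user_id)
        if old is not None:
            # Remove the stale index row so the index never points at old values.
            self.index.delete(old["email"] + SEP + user_id)
        self.data.put(user_id, {"email": email})
        self.index.put(email + SEP + user_id, user_id)

    def find_by_email(self, email):
        # A lookup by email is a short prefix scan over the index table.
        return [uid for _, uid in self.index.scan_prefix(email + SEP)]
```

Every logical write touches two tables, which is why keeping index and data consistent is a concern in its own right: a crash between the two writes in `upsert` would leave them out of sync.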

Apache Omid (Optimistically transaction Management In Datastores):

  • Timestamp oracle assigns global transaction timestamps.
  • Commit table records transaction outcomes.
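  • Shadow cells stored next to data cells record commit timestamps; Omid itself is lock-free and detects write-write conflicts centrally at commit time.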
  • Provides snapshot isolation across multi-row transactions — see concepts/snapshot-isolation.
  • Originally designed for HBase (Yahoo origin, now Apache).
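
How a timestamp oracle and a commit table combine to give snapshot isolation can be sketched with a toy model. This is not Omid's API: `ToyOracle`, `ToyStore`, and `ToyTxn` are hypothetical names, everything lives in one process, and row-level conflict checking at commit time is a simplifying assumption of the sketch.

```python
import itertools


class ToyOracle:
    """Hands out strictly increasing timestamps and records commit outcomes."""

    def __init__(self):
        self._clock = itertools.count(1)
        self.commit_table = {}  # start_ts -> commit_ts, committed transactions only
        self._last_commit = {}  # row -> commit_ts of the latest committed write

    def begin(self):
        return next(self._clock)

    def try_commit(self, start_ts, write_set):
        # First committer wins: abort if any written row was committed by
        # another transaction after our snapshot was taken.
        for row in write_set:
            if self._last_commit.get(row, 0) > start_ts:
                return None
        commit_ts = next(self._clock)
        for row in write_set:
            self._last_commit[row] = commit_ts
        self.commit_table[start_ts] = commit_ts
        return commit_ts


class ToyStore:
    """Multi-version cells: row -> list of (writer_start_ts, value)."""

    def __init__(self, oracle):
        self.oracle = oracle
        self.cells = {}

    def begin(self):
        return ToyTxn(self)


class ToyTxn:
    def __init__(self, store):
        self.store = store
        self.start_ts = store.oracle.begin()
        self.writes = {}

    def read(self, row):
        if row in self.writes:  # a transaction sees its own pending writes
            return self.writes[row]
        best_ts, best_val = 0, None
        for writer_ts, value in self.store.cells.get(row, []):
            commit_ts = self.store.oracle.commit_table.get(writer_ts)
            # Visible only if the writer committed before our snapshot.
            if commit_ts is not None and best_ts < commit_ts < self.start_ts:
                best_ts, best_val = commit_ts, value
        return best_val

    def write(self, row, value):
        self.writes[row] = value

    def commit(self):
        # Tentative versions are written first, tagged with our start timestamp.
        # They stay invisible until the commit table has an entry for us, so
        # all rows become visible together or not at all. (Versions left by
        # aborted transactions are never cleaned up in this sketch.)
        for row, value in self.writes.items():
            self.store.cells.setdefault(row, []).append((self.start_ts, value))
        return self.store.oracle.try_commit(self.start_ts, self.writes) is not None
```

In this model a two-row transfer is visible to later readers as a unit, a concurrent reader keeps seeing the snapshot from its own start timestamp, and a concurrent writer to the same row is aborted at commit time rather than blocked.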

Pinterest Sparrow

Sparrow is Pinterest's internal transactional DB built on top of Phoenix + Omid. The HBase deprecation retrospective names it but does not go into architecture:

"We also built Sparrow on top of Apache Phoenix Omid to support distributed transactions on top of HBase." (Source: sources/2024-05-14-pinterest-hbase-deprecation-at-pinterest)

The Sparrow internal design is not public beyond this mention.

Role in the HBase-deprecation case

Phoenix + Omid + Sparrow together are the canonical bolt-on for the distributed-transactions axis of missing-functionality. HBase guarantees atomicity only within a single row; Omid layers multi-row transactions on top using a timestamp oracle, a commit table, and commit-time conflict detection; Sparrow wraps this into a Pinterest-specific service.

This is the load-bearing instance of axis 3, the "system complexity tax", of patterns/nosql-to-newsql-deprecation:

  • HBase operational complexity: ~50 clusters, JVM, ZooKeeper, HMaster.
  • Plus Phoenix layer: SQL parser, query planner, secondary-index table maintenance.
  • Plus Omid layer: centralized timestamp oracle (a potential bottleneck), commit table, shadow-cell bookkeeping.
  • Plus Sparrow layer: Pinterest-specific service layer on top.

Four layers where one NewSQL store (TiDB) provides the same capability natively.
