
Google Cloud Bigtable

Google Cloud Bigtable is Google's managed wide-column NoSQL store — the external-facing product built on the design of the Bigtable paper (2006), which is also HBase's design ancestor. Rows are addressed by a single row key and kept in lexicographic order, cell versions are timestamped per column family, and range scans over contiguous row keys are a first-class read — which makes it a natural fit for workloads where the read shape is "scan everything with row-key prefix P" or "items with timestamp ≥ T".

Pattern of appearance on this wiki

Bigtable shows up on the wiki as the cross-cloud V1 CDC changelog store for Segment's objects pipeline (pre-2024-08-01). Segment chose Bigtable because its ergonomic range-scan-by-row-key semantics fit the changelog query "items modified since T" — a time-ordered row key like <shard>#<timestamp>#<id> turns the query into a cheap range scan. Quoted from Segment's 2024-08-01 post: "In the V1 system, we used BigTable as our changelog as it suited our requirements quite well. It provided low-latency read and write access to data, which made it suitable for real-time applications." (Source: sources/2024-08-01-segment-0-6m-year-savings-by-using-s3-for-change-data-capture-for-dynamodb)
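A minimal sketch of the row-key trick described above — not Segment's code, and no Bigtable client involved. It simulates Bigtable's lexicographically sorted key space with a sorted Python list to show why a time-ordered key like `<shard>#<timestamp>#<id>` turns "items modified since T" into a cheap range scan; the key layout and the `'~'` upper-bound sentinel are illustrative assumptions:

```python
from bisect import bisect_left

def row_key(shard: str, ts: int, item_id: str) -> str:
    # Zero-pad the timestamp so lexicographic order matches numeric order.
    return f"{shard}#{ts:010d}#{item_id}"

# A toy, sorted "table" standing in for Bigtable's sorted key space.
table = sorted([
    row_key("s0", 1000, "a"),
    row_key("s0", 2000, "b"),
    row_key("s0", 3000, "c"),
    row_key("s1", 1500, "d"),
])

def scan_since(shard: str, ts: int) -> list[str]:
    # "Items in this shard modified since T" becomes a scan over the
    # contiguous key range [shard#T, shard#~): one seek, then read forward.
    start = f"{shard}#{ts:010d}"
    end = f"{shard}#~"  # '~' sorts after '#' and all digits in ASCII
    lo = bisect_left(table, start)
    return [k for k in table[lo:] if k < end]

print(scan_since("s0", 2000))
# → ['s0#0000002000#b', 's0#0000003000#c']
```

Because keys matching the query are physically contiguous, the store never touches rows outside the requested shard/time window — the property Segment relied on for the V1 changelog.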

Why Bigtable was replaced in Segment's V2

The same ergonomics that made Bigtable fit the V1 read pattern did not justify the cross-cloud operational cost at Segment's scale: the base store was DynamoDB on AWS, while Bigtable is a GCP service, so every changelog read paid cross-cloud egress + multi-cloud operational overhead. V2 migrated the changelog to Amazon S3 — trading Bigtable's low-latency random-access profile for S3's cheaper per-byte economics + same-cloud locality — saving ~$0.6M/year and eliminating one cloud boundary. Fits the wiki's cross-cloud cost consolidation framing.
