TAOBench

What it is

TAOBench is an open-source benchmark for relational and distributed databases that synthesises the workload shape of Meta's production TAO social-graph store. It was published at VLDB 2022 by Audrey Cheng and colleagues at UC Berkeley in collaboration with Meta engineers (paper: TAOBench: An End-to-End Benchmark for Social Network Workloads); the preceding VLDB 2021 paper ("Workload Analysis of a Large-Scale Key-Value Store") characterised Meta's TAO workload and motivated the benchmark.

Why it shows up on this wiki

TAOBench is introduced on the wiki via PlanetScale's Tech Solutions post TAOBench: Running social media workloads on PlanetScale (Liz van Dijk, 2022-09-08), which positions it as a social-graph-shaped complement to TPC-C / sysbench-tpcc for evaluating database substrates under workloads TPC-C doesn't cover: "The TPC-C benchmark has had a very long life, and has remained remarkably relevant until this day, but there are scenarios it doesn't cover. Audrey Cheng and her team at University of California, Berkeley identified a real gap when it comes to available synthetic benchmarks for a more recent, but highly pervasive workload type: social media networks."

Schema

Two tables — canonicalised as concepts/social-graph-objects-and-edges:

  • objects — the social-graph entities (users, posts, pictures, comments, pages).
  • edges — the many-to-many relations between entities (likes, shares, friendships, follows, reactions). The edges table is a classic many-to-many junction linking objects rows to other objects rows.

"In simple relational database terms: The edges table can be viewed as a 'many-to-many' relationship table that links rows in objects to other rows in objects." (Source: sources/2026-04-21-planetscale-taobench-running-social-media-workloads-on-planetscale.)

This is the relational encoding of the social graph — distinct from Meta's graph-native TAO API (objects + associations as first-class API primitives).
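
A minimal sketch of that relational encoding, runnable against SQLite. The column names and types here are illustrative assumptions; the benchmark's own setup scripts define the real DDL per backend:

```python
# Hypothetical two-table social-graph encoding, using sqlite3 so it runs
# anywhere. Columns are illustrative assumptions, not TAOBench's actual DDL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (
    id    INTEGER PRIMARY KEY,   -- a user, post, picture, comment, page, ...
    data  TEXT                   -- opaque payload
);
CREATE TABLE edges (
    id1   INTEGER NOT NULL REFERENCES objects(id),  -- source object
    id2   INTEGER NOT NULL REFERENCES objects(id),  -- target object
    type  TEXT    NOT NULL,      -- 'like', 'follow', 'friend', ...
    data  TEXT,
    PRIMARY KEY (id1, id2, type) -- one edge of a given type per object pair
);
""")

# A 'like' is a many-to-many row linking two objects rows:
conn.execute("INSERT INTO objects VALUES (1, 'user:alice'), (2, 'post:cat-video')")
conn.execute("INSERT INTO edges VALUES (1, 2, 'like', NULL)")
print(conn.execute("SELECT COUNT(*) FROM edges WHERE id2 = 2").fetchone()[0])  # fan-in of the post
```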

Workload profiles

Two pre-configured workload scenarios ship with TAOBench:

  • Workload A (Application) — transactional subset of the queries; concentrates on the OLTP-like access patterns within Meta's workload.
  • Workload O (Overall) — generalised profile of the full TAO workload.

Critically, the statistical distribution of data in both objects and edges is baked into the load phase, not just the query phase: "data should be reloaded when switching between them" (Source: sources/2026-04-21-planetscale-taobench-running-social-media-workloads-on-planetscale). This couples the benchmark to Meta's production workload more tightly than sysbench-tpcc's Lua-script-level query shaping: TAOBench encodes the workload's storage shape, not just its query mix.
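
A sketch of what "baked into the load phase" means in practice, assuming a Zipf-style popularity skew as the scenario-specific parameter. The scenario names map to the two workloads above, but the skew values are invented placeholders, not TAOBench's published parameters:

```python
# Why data must be reloaded between workloads: the *loader*, not just the
# query mix, is parameterised per scenario. Skew values are placeholders.
import random

SCENARIOS = {
    "workload_a": {"zipf_s": 1.5},  # hypothetical: heavier skew
    "workload_o": {"zipf_s": 1.1},  # hypothetical: broader profile
}

def load_edges(scenario: str, n_objects: int, n_edges: int, seed: int = 0):
    """Generate edge rows whose target-object popularity follows the
    scenario's skew -- the distribution lives in the data itself."""
    rng = random.Random(seed)
    s = SCENARIOS[scenario]["zipf_s"]
    # Zipf-like sampling: weight object k proportionally to 1 / (k+1)**s.
    weights = [1 / (k + 1) ** s for k in range(n_objects)]
    targets = rng.choices(range(n_objects), weights=weights, k=n_edges)
    return [(rng.randrange(n_objects), t, "like") for t in targets]

edges_a = load_edges("workload_a", n_objects=1000, n_edges=10_000)
edges_o = load_edges("workload_o", n_objects=1000, n_edges=10_000)
# The same object id has a very different fan-in under each scenario, so
# experiments tuned for one dataset are meaningless against the other.
print(sum(1 for _, t, _ in edges_a if t == 0), sum(1 for _, t, _ in edges_o if t == 0))
```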

Three-phase protocol

TAOBench runs in three explicit phases:

  1. Load phase — bulk-insert rows into objects and edges according to the chosen workload scenario. Populates the dataset to the size dictated by the workload profile.
  2. Bulk-reads phase (unmeasured) — "very aggressive range scans across the entire dataset to serve as general 'warmup' to whichever caching mechanisms may be in place, and also aggregates the necessary statistical information to feed into the experiments themselves." Explicitly "not measured, but can be extremely punishing to the underlying infrastructure."
  3. Experiments phase (measured) — accepts predefined concurrency levels and runtime operation targets to scale the chosen workload to various infrastructure sizes.

The separation of warmup into its own unmeasured phase is a methodological improvement over single-phase benchmarks that conflate cold-cache ramp with measured steady state. The bulk-reads phase also tests range-scan capacity, a different substrate axis from the experiments phase's concurrency-driven point-operation load, so a single benchmark run exercises two substrate axes.
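
A skeleton of the three-phase shape, with only phase 3 under the stopwatch. FakeDB and its methods are hypothetical stand-ins for whatever driver the real benchmark binds to; the phase boundaries are the point here:

```python
import time
from concurrent.futures import ThreadPoolExecutor

class FakeDB:
    """Stand-in driver; real TAOBench binds to an actual database."""
    def __init__(self):
        self.rows = []
    def load_row(self, row):
        self.rows.append(row)
    def range_scan(self, start, end):
        # Full-dataset scan; returns the statistics the experiments need.
        return {"n_rows": len(self.rows)}
    def run_op(self, stats):
        return stats["n_rows"]

def run_benchmark(db, dataset, concurrency: int, op_target: int) -> float:
    # Phase 1: load -- bulk-insert objects/edges per the chosen scenario.
    for row in dataset:
        db.load_row(row)

    # Phase 2: bulk reads -- aggressive scans across the whole dataset.
    # Warms caches and gathers stats for the experiments; deliberately untimed.
    stats = db.range_scan(start=None, end=None)

    # Phase 3: experiments -- the only timed phase. Concurrency and operation
    # target are the knobs that scale the run to the infrastructure size.
    t0 = time.monotonic()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(lambda _: db.run_op(stats), range(op_target)))
    return op_target / (time.monotonic() - t0)  # measured ops/sec

print(run_benchmark(FakeDB(), dataset=range(10_000), concurrency=8, op_target=50_000))
```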

Hot-row / thundering-herd as design target

TAOBench's objects + edges model is chosen in part to explicitly stress hot-row behaviour and thundering-herd response. Van Dijk's framing: "Focusing the workload around these two simplified concepts allows the benchmark to simulate typical 'hot row' scenarios that can be particularly challenging for relational databases to handle. Think of what happens when something goes viral: a thundering herd of users comes through to interact with a specific piece of content posted somewhere."

This makes TAOBench the first named benchmark on this wiki that explicitly measures substrate behaviour under viral-content skew — distinct from sysbench-tpcc, whose access patterns are shard-key-aligned (i.e., no hot rows by construction).
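
A toy illustration of the access pattern (not TAOBench's driver code): every client converges on one row, so throughput is bounded by per-row contention rather than by total cores, which is exactly the regime shard-key-aligned benchmarks never enter:

```python
# Thundering-herd sketch: all writers serialise on one "viral" row's lock.
import threading

like_counts = {post_id: 0 for post_id in range(1000)}
row_locks = {post_id: threading.Lock() for post_id in like_counts}
VIRAL_POST = 42  # the herd converges on one row

def client(n_ops: int):
    for _ in range(n_ops):
        with row_locks[VIRAL_POST]:        # every writer queues here,
            like_counts[VIRAL_POST] += 1   # unlike shard-key-aligned load

threads = [threading.Thread(target=client, args=(10_000,)) for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
print(like_counts[VIRAL_POST])  # 80000, paid for with lock contention
```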

Positioning vs sysbench

| Axis | sysbench-tpcc | TAOBench |
| --- | --- | --- |
| Workload shape | OLTP / online-retail-adjacent | Social graph |
| Schema | TPC-C derivative (warehouses, districts, customers, orders) | objects + edges (many-to-many) |
| Access pattern | Shard-key-aligned; no hot rows | Skewed; explicit hot-row stress |
| Workload reloads | Not required between scenarios | Required between Workload A / O |
| Warmup | Implicit in ramp | Explicit unmeasured bulk-reads phase |
| Origin | TPC-C academic benchmark + Percona port | Meta-workload-derived, VLDB-published |
| Use on PlanetScale | 1M-QPS single-tenant capability | 48-core multi-tenant-serverless capability |

The two benchmarks are intentionally complementary in PlanetScale's published benchmarking work — sysbench-tpcc for shard-linear scaling demonstration, TAOBench for social-graph-shaped substrate maturity disclosure.

Seen in

  • sources/2026-04-21-planetscale-taobench-running-social-media-workloads-on-planetscale — Liz van Dijk (PlanetScale, 2022-09-08) introduces TAOBench to the PlanetScale benchmarking arsenal. Cheng's Berkeley/Meta team independently measured PlanetScale infrastructure against TAOBench; PlanetScale then verified internally using the public benchmark code. The published PlanetScale run uses a 48-CPU-core resource cap, allocated as 44 cores for the query path plus 4 cores for multi-tenant serverless overhead (edge load balancers) — see concepts/constrained-resource-benchmark for the methodology generalisation. The key takeaway van Dijk names is graceful saturation, not peak QPS: "sustained stability of PlanetScale clusters under even the most extreme resource pressure" — the benchmark-at-the-ceiling property that distinguishes mature substrates from ones that collapse past 100% CPU (concepts/graceful-saturation-vs-congestive-collapse).