TAOBench¶
What it is¶
TAOBench is an open-source benchmark for relational and distributed databases that synthesises the workload shape of Meta's production TAO social-graph store. It was published at VLDB 2022 by Audrey Cheng and colleagues at UC Berkeley in collaboration with Meta engineers (paper: TAOBench: An End-to-End Benchmark for Social Network Workloads); the preceding VLDB 2021 paper ("Workload Analysis of a Large-Scale Key-Value Store") characterised Meta's TAO workload and motivated the benchmark.
Why it shows up on this wiki¶
TAOBench is introduced on the wiki via PlanetScale's Tech
Solutions post TAOBench: Running social media workloads on PlanetScale
(Liz van Dijk, 2022-09-08), which positions it as a social-graph-shaped
complement to TPC-C / sysbench-tpcc for evaluating database
substrates under workloads TPC-C doesn't cover: "The TPC-C benchmark
has had a very long life, and has remained remarkably relevant until
this day, but there are scenarios it doesn't cover. Audrey Cheng and
her team at University of California, Berkeley identified a real gap
when it comes to available synthetic benchmarks for a more recent,
but highly pervasive workload type: social media networks."
Schema¶
Two tables — canonicalised as concepts/social-graph-objects-and-edges:
- objects — the social-graph entities (users, posts, pictures, comments, pages).
- edges — the many-to-many relations between entities (likes, shares, friendships, follows, reactions).

The edges table is a classic many-to-many junction linking objects rows to other objects rows.
"In simple relational database terms: The edges table can be
viewed as a 'many-to-many' relationship table that links rows in
objects to other rows in objects."
(Source: sources/2026-04-21-planetscale-taobench-running-social-media-workloads-on-planetscale.)
This is the relational encoding of the social graph — distinct from Meta's graph-native TAO API (objects + associations as first-class API primitives).
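The two-table shape above can be sketched in a few lines of SQL. This is a minimal illustration using SQLite; the column names and DDL are assumptions for clarity, not TAOBench's actual schema.

```python
import sqlite3

# Minimal sketch of the objects + edges relational encoding.
# Column names are illustrative, not TAOBench's real DDL.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE objects (
    id    INTEGER PRIMARY KEY,   -- a user, post, picture, comment, page, ...
    otype TEXT NOT NULL,
    data  TEXT
);
CREATE TABLE edges (
    src   INTEGER NOT NULL REFERENCES objects(id),
    dst   INTEGER NOT NULL REFERENCES objects(id),
    etype TEXT NOT NULL,         -- like, share, friendship, follow, ...
    PRIMARY KEY (src, dst, etype)
);
""")
db.executemany("INSERT INTO objects VALUES (?, ?, ?)",
               [(1, "user", "alice"), (2, "user", "bob"), (3, "post", "hello")])
db.executemany("INSERT INTO edges VALUES (?, ?, ?)",
               [(1, 3, "like"), (2, 3, "like"), (1, 2, "follow")])

# The junction table answers "who liked post 3?" by joining edges back to objects.
likers = db.execute("""
    SELECT o.data FROM edges e JOIN objects o ON o.id = e.src
    WHERE e.dst = 3 AND e.etype = 'like' ORDER BY o.data
""").fetchall()
print([r[0] for r in likers])   # ['alice', 'bob']
```

Every graph query becomes a self-join of objects through edges, which is exactly the access pattern the benchmark stresses.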
Workload profiles¶
Two pre-configured workload scenarios ship with TAOBench:
- Workload A (Application) — transactional subset of the queries; concentrates on the OLTP-like access patterns within Meta's workload.
- Workload O (Overall) — generalised profile of the full TAO workload.
Critically, the statistical distribution of data in both objects
and edges is baked into the load phase, not just the query
phase: "data should be reloaded when switching between them"
(Source: sources/2026-04-21-planetscale-taobench-running-social-media-workloads-on-planetscale).
This couples the benchmark to Meta's workload more tightly than
sysbench-tpcc's Lua-script-level shaping does: TAOBench encodes the
workload's storage shape, not just its query mix.
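Why reloading matters can be shown with a toy skewed loader. The Zipf-like sampler and the alpha values below are assumptions for illustration; TAOBench's real distributions come from the published workload characterisation, not from this formula.

```python
import random
from collections import Counter

# Sketch: the load phase bakes a skewed edge distribution into the data
# itself. Two workload profiles with different skew parameters produce
# materially different datasets, so switching profiles means reloading.
def skewed_edges(n_objects: int, n_edges: int, alpha: float, seed: int = 42):
    rng = random.Random(seed)
    # Zipf-like popularity: object at rank r gets weight 1/(r+1)^alpha.
    weights = [1.0 / (rank + 1) ** alpha for rank in range(n_objects)]
    dsts = rng.choices(range(n_objects), weights=weights, k=n_edges)
    return [(rng.randrange(n_objects), dst) for dst in dsts]

edges_a = skewed_edges(1000, 10_000, alpha=1.2)   # steeper, "A"-style skew
edges_o = skewed_edges(1000, 10_000, alpha=0.8)   # flatter, "O"-style skew

def top1_share(edges):
    # Fraction of all edges pointing at the single hottest object.
    return Counter(d for _, d in edges).most_common(1)[0][1] / len(edges)

print(round(top1_share(edges_a), 3), round(top1_share(edges_o), 3))
```

The steeper profile concentrates far more edges on its hottest object, so queries replayed against the wrong dataset would see the wrong contention profile.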
Three-phase protocol¶
TAOBench runs in three explicit phases:
- Load phase — bulk-insert rows into objects and edges according to the chosen workload scenario. Populates the dataset to the size dictated by the workload profile.
- Bulk-reads phase (unmeasured) — "very aggressive range scans across the entire dataset to serve as general 'warmup' to whichever caching mechanisms may be in place, and also aggregates the necessary statistical information to feed into the experiments themselves." Explicitly "not measured, but can be extremely punishing to the underlying infrastructure."
- Experiments phase (measured) — accepts predefined concurrency levels and runtime operation targets to scale the chosen workload to various infrastructure sizes.
The separation of warmup into its own unmeasured phase is a methodology improvement over single-phase benchmarks that conflate cold-cache ramp with measured steady-state. The bulk-reads phase tests range-scan capacity, which is a different substrate axis than the experiments phase's concurrency-driven point-op load — two substrate axes are exercised in one benchmark run.
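A driver loop for the three phases might look like the following. All the method names on `db` and `workload` here are hypothetical stand-ins, not TAOBench's actual API; the point is the phase boundaries, in particular that only the third phase is timed.

```python
import time

# Sketch of the three-phase protocol. `db` and `workload` are assumed
# interfaces (bulk_insert / scan / execute, generate_* / fit / operations);
# TAOBench's real driver differs.
def run_benchmark(db, workload, concurrency: int, target_ops: int):
    # Phase 1: load — populate objects and edges per the workload profile.
    db.bulk_insert("objects", workload.generate_objects())
    db.bulk_insert("edges", workload.generate_edges())

    # Phase 2: bulk reads (unmeasured) — range scans warm the caches and
    # gather the statistics the experiments phase will draw from.
    stats = [db.scan(table) for table in ("objects", "edges")]
    workload.fit(stats)

    # Phase 3: experiments — the only measured phase.
    start = time.monotonic()
    for op in workload.operations(concurrency, target_ops):
        db.execute(op)
    elapsed = max(time.monotonic() - start, 1e-9)
    return target_ops / elapsed   # throughput in ops/sec
```

Keeping the timer entirely inside phase 3 is what prevents cold-cache ramp from contaminating the steady-state numbers.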
Hot-row / thundering-herd as design target¶
TAOBench's objects + edges model is chosen in part to
explicitly stress hot-row
behaviour and thundering-herd
response. Van Dijk's framing: "Focusing the workload around these
two simplified concepts allows the benchmark to simulate typical
'hot row' scenarios that can be particularly challenging for
relational databases to handle. Think of what happens when
something goes viral: a thundering herd of users comes through
to interact with a specific piece of content posted somewhere."
This makes TAOBench the first named benchmark on this wiki that
explicitly measures substrate behaviour under viral-content skew
— distinct from sysbench-tpcc, whose access patterns are
shard-key-aligned (i.e., no hot rows by construction).
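The hot-row effect is easy to reproduce in miniature: when content goes viral, every writer updates the same counter row, so all of them serialise on that one row. A minimal sketch using SQLite and threads (table and column names are illustrative; the external lock stands in for the row-level locking a server database would do):

```python
import sqlite3
import threading

# One "viral" post; every thread updates the same row.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE counts (object_id INTEGER PRIMARY KEY, likes INTEGER)")
db.execute("INSERT INTO counts VALUES (42, 0)")
lock = threading.Lock()   # stand-in for the server's row lock

def like(n: int):
    for _ in range(n):
        with lock:   # the thundering herd funnels through one hot row
            db.execute("UPDATE counts SET likes = likes + 1 WHERE object_id = 42")

threads = [threading.Thread(target=like, args=(100,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(db.execute("SELECT likes FROM counts WHERE object_id = 42").fetchone()[0])  # 800
```

No amount of sharding helps here, because the skew is per-row, not per-shard: the herd converges on one key, which is exactly the behaviour shard-key-aligned benchmarks never exercise.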
Positioning vs sysbench¶
| Axis | sysbench-tpcc | TAOBench |
|---|---|---|
| Workload shape | OLTP / online-retail-adjacent | Social graph |
| Schema | TPC-C derivative (warehouses, districts, customers, orders) | objects + edges (many-to-many) |
| Access pattern | Shard-key-aligned; no hot rows | Skewed; explicit hot-row stress |
| Workload reloads | Not required between scenarios | Required between Workload A / O |
| Warmup | Implicit in ramp | Explicit unmeasured bulk-reads phase |
| Origin | TPC-C academic benchmark + Percona port | Meta-workload-derived, VLDB-published |
| Use on PlanetScale | 1M-QPS single-tenant capability | 48-core multi-tenant-serverless capability |
The two benchmarks are intentionally complementary in PlanetScale's
published benchmarking work — sysbench-tpcc for shard-linear
scaling demonstration, TAOBench for social-graph-shaped substrate
maturity disclosure.
Seen in¶
- sources/2026-04-21-planetscale-taobench-running-social-media-workloads-on-planetscale — Liz van Dijk (PlanetScale, 2022-09-08) introduces TAOBench to the PlanetScale benchmarking arsenal. Cheng's Berkeley/Meta team independently measured PlanetScale infrastructure against TAOBench; PlanetScale then verified internally using the public benchmark code. The published PlanetScale run uses a 48-CPU-core resource cap, allocated as 44 cores for the query path plus 4 cores for multi-tenant serverless overhead (edge load balancers) — see concepts/constrained-resource-benchmark for the methodology generalisation. The key takeaway van Dijk names is graceful saturation, not peak QPS: "sustained stability of PlanetScale clusters under even the most extreme resource pressure" — the benchmark-at-the-ceiling property that distinguishes mature substrates from ones that collapse past 100% CPU (concepts/graceful-saturation-vs-congestive-collapse).
Related¶
- systems/meta-tao — the production social-graph store TAOBench models.
- systems/sysbench — the OLTP-shaped complement for PlanetScale benchmarking.
- systems/planetscale, systems/vitess, systems/mysql — the substrate under test in the published results.
- concepts/social-graph-objects-and-edges — the schema this benchmark canonicalises.
- concepts/hot-row-problem, concepts/thundering-herd — explicit workload-design targets.
- concepts/benchmark-representativeness — the representativeness axis TAOBench argues via its VLDB characterisation work.
- concepts/constrained-resource-benchmark — the 48-core-cap methodology shape.
- concepts/graceful-saturation-vs-congestive-collapse — the substrate property the PlanetScale run highlights.
- patterns/reproducible-benchmark-publication — open-source benchmark code + independent measurement.