Meta — DCPerf: An open source benchmark suite for hyperscale compute applications
Summary
Meta open-sourced DCPerf, a benchmark suite in which each benchmark is designed by referencing a large Meta production application, built to capture microarchitectural characteristics (IPC, core frequency, power) that existing benchmarks such as SPEC CPU do not represent for hyperscale workloads. Meta uses DCPerf internally alongside SPEC CPU for capacity planning, early performance projection, hardware bug detection, and platform co-optimization with CPU vendors on pre-silicon / early-silicon parts. Over two years the suite was extended to x86 and ARM, chiplet-based architectures, and multi-tenant core-count scaling.
Key takeaways
- Hyperscale datacenter workloads are a distinct market segment. "Workloads developed by large-scale internet companies running in their datacenters have very different characteristics than those in HPC or traditional enterprise market segments." Existing benchmarks "fall short of capturing these characteristics and hence do not provide a reliable avenue to design and optimize modern server and datacenter designs." (Source: original article)
- Each DCPerf benchmark is anchored to a real Meta application. "Each benchmark within DCPerf is designed by referencing a large application within Meta's production server fleet." This is the canonical workload-representative-from-production pattern — the same design stance as Figma's custom OpenSearch Go harness (patterns/custom-benchmarking-harness) but on the microarchitecture-vs-application axis rather than the query-load axis.
- Representativeness is measured at the microarchitectural level. Meta publishes two comparison graphs — Instructions-Per-Cycle (IPC) and average core frequency — comparing production applications vs DCPerf vs SPEC CPU. In both, "Red circles highlight that DCPerf more accurately represents" the production values. This canonicalises concepts/benchmark-representativeness as a measurable property, not an assertion.
- Multi-ISA: x86 + ARM. "Over the past few years, we have continuously enhanced these benchmarks to make them compatible with different instruction set architectures, including x86 and ARM." Cross-ISA validity is load-bearing for Meta because Meta operates both (contrast: systems/arm64-isa is the wiki's existing ARM64 reference, surfaced from the Go-compiler bug).
- Emerging-trend coverage. DCPerf "can be used to evaluate emerging industry trends (e.g., chiplet-based architectures)" and "added support for multi-tenancy so that benchmarks can scale and make use of rapidly increasing core counts on modern server platforms." A benchmark suite has to evolve with the hardware it measures.
- DCPerf is used alongside SPEC CPU, not instead of it. "We have been using DCPerf internally, in addition to the SPEC CPU benchmark suite, for product evaluation at Meta." The industry-standard benchmark retains value; DCPerf adds the hyperscale-application signal the industry standard misses.
- Five internal use cases named. (1) data-center deployment configuration choices, (2) early performance projections for capacity planning, (3) identifying performance bugs in hardware and system software, (4) joint platform optimization with hardware-industry collaborators, (5) deciding which platforms to deploy.
- Pre-silicon / early-silicon partnership with CPU vendors. "Over the last two years we have collaborated with leading CPU vendors to further validate DCPerf on pre silicon and/or early silicon setups to debug performance issues and identify hardware and system software optimizations on their roadmap products." Canonical instance of patterns/pre-silicon-validation-partnership.
- Ambition stated. "DCPerf has the potential to become an industry standard method to capture important workload characteristics of compute workloads that run in hyperscale datacenter deployments." Meta is inviting academia, the hardware industry, and internet companies to use it for design + evaluation of future products.
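The two metrics in the comparison graphs are straightforward to derive from hardware performance counters. A minimal sketch, using hypothetical counter values (not Meta's data — the post publishes graphs only), of how IPC and average core frequency fall out of an `instructions`/`cycles` reading such as `perf stat` reports:

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle: the first axis of DCPerf's comparison."""
    return instructions / cycles

def avg_core_frequency_ghz(cycles: int, busy_seconds: float) -> float:
    """Average core frequency: unhalted cycles divided by busy time."""
    return cycles / busy_seconds / 1e9

# Hypothetical counter readings for one core over a 10 s window.
instructions = 18_000_000_000
cycles = 25_000_000_000
busy_seconds = 10.0

print(f"IPC  = {ipc(instructions, cycles):.2f}")                          # 0.72
print(f"freq = {avg_core_frequency_ghz(cycles, busy_seconds):.2f} GHz")   # 2.50
```

Comparing distributions of these two values between a benchmark run and a production fleet sample is exactly the representativeness check the post's graphs visualise.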
Systems extracted
- systems/dcperf — the benchmark suite itself. Source of truth: github.com/facebookresearch/DCPerf.
- systems/spec-cpu — the incumbent industry-standard CPU benchmark suite, named explicitly as the comparison point.
- systems/arm64-isa — named as a target ISA DCPerf supports.
Concepts extracted
- concepts/hyperscale-compute-workload — the distinct workload shape that motivated DCPerf: diverse application categories, memory + cache + branch behaviour unlike HPC or enterprise workloads, shaped by multi-tenant cloud deployment.
- concepts/benchmark-representativeness — the measurable property DCPerf optimises for: IPC distribution + frequency distribution match between benchmark and production apps.
- concepts/benchmark-methodology-bias — the existing wiki concept (Cloudflare Workers benchmark post) generalises; DCPerf is another instance of "existing benchmark biases away from the workload it's used to evaluate."
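Benchmark representativeness can be operationalised as the relative error between a benchmark's microarchitectural metric and the production application's value. A sketch with hypothetical IPC numbers, chosen only to illustrate the metric (the post publishes graphs, not values, and the `dcperf_like` / `spec_like` figures below are invented):

```python
def relative_gap(benchmark: float, production: float) -> float:
    """Relative error of a benchmark metric vs. the production value."""
    return abs(benchmark - production) / production

# Hypothetical IPC values for one production app and two candidate proxies.
production_ipc = 0.80
dcperf_like_ipc = 0.78   # assumed close to production
spec_like_ipc = 1.60     # assumed far from production

print(round(relative_gap(dcperf_like_ipc, production_ipc), 3))  # 0.025
print(round(relative_gap(spec_like_ipc, production_ipc), 3))    # 1.0
```

A smaller gap across the fleet's applications is the property DCPerf optimises for; the same computation applies to average core frequency or any other counter-derived metric.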
Patterns extracted
- patterns/workload-representative-benchmark-from-production — reference a real production application when designing each benchmark; validate representativeness at microarchitectural level, not just at aggregate score.
- patterns/pre-silicon-validation-partnership — ship workload-representative benchmarks to CPU vendors to debug + tune pre-silicon / early-silicon roadmap products collaboratively; Meta names this as a two-year investment with multiple optimization wins in core microarchitecture + SoC power.
- patterns/custom-benchmarking-harness — adjacent wiki pattern (Figma). DCPerf is the hardware-evaluation-side realisation; Figma is the application-config-comparison-side realisation; both are "vendor default benchmarks don't match my workload, so I built a harness that does."
Numbers disclosed
- 0 production-scale numbers disclosed. The post is architectural / methodological, not a retrospective with quantified gains.
- Visual-only comparison graphs: IPC and average core frequency across production apps / DCPerf / SPEC CPU.
- Two-year duration of vendor-collaboration validation on pre-silicon / early-silicon.
Caveats
- Announcement-voice post (open-sourcing event + methodology overview). Not a retrospective: no specific production workloads named, no per-benchmark constituent list, no vendor names in the collaboration, no quantified IPC / frequency delta numbers.
- The comparison graphs are pictures, not tables; the wiki cannot reproduce the exact IPC / frequency values.
- "Representativeness" is asserted via two microarchitectural metrics (IPC, frequency). Cache-hit rate, branch-misprediction rate, memory-bandwidth consumption, TLB pressure are named as hyperscale-workload characteristics but not included in the published comparison.
- Multi-tenancy support is "added" but the topology (how many tenants per benchmark, resource-isolation model) is not disclosed.
- Chiplet-architecture evaluation is named as a use case; no specific chiplet design evaluated publicly.
- The five internal use cases are stated as narrative; no per-use-case outcome numbers.
- No version history / benchmark-suite composition disclosed in this post; GitHub repo would need to be inspected.
Source
- Original: https://engineering.fb.com/2024/08/05/data-center-engineering/dcperf-open-source-benchmark-suite-for-hyperscale-compute-applications/
- Raw markdown:
  raw/meta/2024-08-05-dcperf-an-open-source-benchmark-suite-for-hyperscale-compute-53f2ca8e.md
- Open-source: github.com/facebookresearch/DCPerf
- HN: item 41162576 (53 points)
Related
- systems/dcperf — the benchmark suite.
- systems/spec-cpu — the industry-standard comparison point.
- concepts/hyperscale-compute-workload — the workload shape DCPerf targets.
- concepts/benchmark-representativeness — the property DCPerf optimises for.
- concepts/benchmark-methodology-bias — the Cloudflare-framed sibling concept; DCPerf's IPC / frequency graphs argue that SPEC CPU is biased for hyperscale applications.
- patterns/workload-representative-benchmark-from-production — the design rule DCPerf embodies.
- patterns/pre-silicon-validation-partnership — the vendor-collaboration pattern.
- patterns/custom-benchmarking-harness — the Figma sibling on the application-config axis.
- companies/meta — Meta's engineering portfolio context.