PATTERN
Pre-silicon validation partnership¶
Intent¶
Ship a workload-representative benchmark suite to CPU / SoC / accelerator vendors and collaborate with them on pre-silicon simulations and early-silicon bring-up — so that microarchitectural tuning + SoC-level optimizations on the vendor's roadmap products are matched to your production workload shape, not to the vendor's own synthetic benchmark suite.
Context¶
A hyperscaler's hardware procurement cycle spans years. By the time a vendor's chip is available and benchmarkable in production, tens of millions of dollars of design decisions have already been locked in. If those decisions were tuned against a benchmark that doesn't represent hyperscale workloads (see concepts/benchmark-methodology-bias), the chip will land with suboptimal performance on the workloads that drive capacity planning.
The fix is to get your workload-representative benchmark into the vendor's pre-silicon flow — architectural simulators, early silicon test chips, cycle-accurate models — so tuning happens against the right shape from the start.
Mechanism¶
Precondition — a workload-representative benchmark exists¶
You cannot run this pattern without a workload-representative benchmark that your vendor can share + execute. At Meta that artifact is DCPerf.
Two-phase collaboration¶
- Pre-silicon. Run the benchmark against the vendor's architectural simulators + cycle-accurate models. Iterate on microarchitecture parameters (pipeline depths, cache sizes + hierarchy, branch-prediction structures, SoC power-management policies). "There have been multiple instances where we have been able to identify performance optimizations in areas such as CPU core microarchitecture settings and SOC power management optimizations." (Source: sources/2024-08-05-meta-dcperf-open-source-benchmark-suite)
- Early-silicon. Run the benchmark on the first test chips. Catch performance bugs before mass production; catch system-software issues (firmware, driver, kernel, scheduler) before the chip lands in production data centers.
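The pre-silicon phase above is, at its core, a loop: run the workload-representative benchmark under each candidate microarchitecture configuration in the vendor's simulator, then rank configurations by score. A minimal sketch of that loop, where `UarchConfig`, its parameters, and `simulate_benchmark` (a toy cost model standing in for a real cycle-accurate simulator run) are all hypothetical illustrations, not anything from DCPerf or a vendor flow:

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class UarchConfig:
    # Hypothetical tunables of the kind iterated on pre-silicon.
    l2_kib: int          # L2 cache size
    pipeline_depth: int  # pipeline stages
    btb_entries: int     # branch-target-buffer entries

def simulate_benchmark(cfg: UarchConfig) -> float:
    """Stand-in for a cycle-accurate simulator run of the benchmark.
    Returns a score (higher is better). Toy model only: larger caches and
    branch structures help this workload shape; very deep pipelines hurt
    it (branch-heavy server code)."""
    return (cfg.l2_kib / 1024 + cfg.btb_entries / 4096
            - abs(cfg.pipeline_depth - 12) * 0.5)

def sweep(grid):
    """Score every candidate config and rank best-first."""
    scored = [(simulate_benchmark(c), c) for c in grid]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored

grid = [UarchConfig(l2, depth, btb)
        for l2, depth, btb in product([512, 1024, 2048],
                                      [10, 12, 16],
                                      [2048, 4096])]
best_score, best_cfg = sweep(grid)[0]
```

The point of the pattern is what fills in `simulate_benchmark`: with DCPerf in the loop, the score reflects hyperscale workload shape rather than a synthetic suite.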
Duration¶
Meta reports two years of this collaboration cadence with "leading CPU vendors" across pre-silicon and/or early-silicon setups. It's not a one-shot engagement; it's a continuous partnership through the vendor's chip-design + bring-up timeline.
Bidirectional outcomes¶
Meta frames this collaboration as feeding optimizations in two directions: vendor ships a better chip for Meta's workload; Meta ships cleaner benchmarks + more characterization data back to the vendor. Both parties benefit; the benchmark is the common-language artifact.
Canonical instance — Meta + CPU vendors via DCPerf¶
Meta explicitly states:
"Over the last two years we have collaborated with leading CPU vendors to further validate DCPerf on pre silicon and/or early silicon setups to debug performance issues and identify hardware and system software optimizations on their roadmap products."
The wiki diagrams this as part of the same post (sources/2024-08-05-meta-dcperf-open-source-benchmark-suite): "areas of HW/SW design where we have seen DCPerf being representative of production usage and being beneficial for delivering relevant performance signals and help with optimizations."
Open-sourcing DCPerf expands the partnership from Meta ↔ vendors to any hyperscale-relevant organisation ↔ any vendor with access to the suite.
Why it works¶
- Shifts tuning left. Optimization happens while design is still malleable, not after the chip is taped out.
- Aligns vendor incentives with the hyperscaler's workload. The vendor's public SPEC numbers still matter for non-hyperscaler customers, but Meta-specific optimization is earned through DCPerf, not assumed.
- Catches bugs cheap. Performance / correctness issues found pre-silicon cost orders of magnitude less to fix than post-silicon.
- Enables novel architectures. Chiplet, heterogeneous-core clusters, and mixed-ISA platforms are evaluated against representative workloads before mass deployment; Meta specifically names DCPerf validation on chiplet-based architectures.
Anti-patterns¶
- Wait for generally-available silicon. Too late — the relevant design decisions have already been made against someone else's benchmark.
- NDA-only benchmark. If the benchmark can't be shared in the vendor's simulator / early-silicon environment, the partnership can't run. Meta's answer is open-sourcing DCPerf outright.
- Aggregate-score-only evaluation. A chip can score well in aggregate while regressing on the microarchitectural behaviour the benchmark is supposed to proxy; validate at the right level (see concepts/benchmark-representativeness).
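The aggregate-score failure mode is easy to demonstrate numerically: a geometric mean over per-workload speedups can report a healthy overall win while one workload regresses badly. A minimal sketch — the workload names and speedup numbers are illustrative, not actual DCPerf jobs or results:

```python
from math import prod

def geomean(xs):
    """Geometric mean: the usual way benchmark suites aggregate speedups."""
    xs = list(xs)
    return prod(xs) ** (1 / len(xs))

# Hypothetical per-workload speedups of a candidate chip vs. baseline.
speedups = {"web": 1.30, "cache": 1.25, "ranking": 1.20, "video": 0.70}

agg = geomean(speedups.values())  # the headline number looks like a win
regressions = {k: v for k, v in speedups.items() if v < 1.0}

print(f"aggregate speedup: {agg:.2f}x")      # positive overall
print(f"regressed workloads: {regressions}")  # hidden 30% loss on video
```

The aggregate comes out above 1.0x even though one workload lost 30%; per-component validation is what surfaces the regression before mass production.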
Seen in¶
- sources/2024-08-05-meta-dcperf-open-source-benchmark-suite — Meta's two-year DCPerf vendor-collaboration disclosure. Multiple identified optimizations in core microarchitecture and SoC power management.
Related¶
- systems/dcperf — the benchmark artifact this pattern depends on.
- patterns/workload-representative-benchmark-from-production — the prerequisite pattern; you can't run pre-silicon validation without the benchmark.
- concepts/benchmark-representativeness — the property the partnership exploits.
- concepts/hyperscale-compute-workload — the workload shape vendors tune toward through this partnership.
- companies/meta — canonical hyperscaler instance.