
PATTERN Cited by 1 source

Pre-silicon validation partnership

Intent

Ship a workload-representative benchmark suite to CPU / SoC / accelerator vendors and collaborate with them on pre-silicon simulation and early-silicon bring-up, so that the microarchitectural tuning + SoC-level optimizations on the vendor's roadmap products are matched to your production workload shape, not to the vendor's own synthetic benchmark suite.

Context

A hyperscaler's hardware procurement cycle spans years. By the time a vendor's chip is available and benchmarkable in production, tens of millions of dollars of design decisions have already been locked in. If those decisions were tuned against a benchmark that doesn't represent hyperscale workloads (see concepts/benchmark-methodology-bias), the chip will land with suboptimal performance on the workloads that drive capacity planning.

The fix is to get your workload-representative benchmark into the vendor's pre-silicon flow — architectural simulators, early silicon test chips, cycle-accurate models — so tuning happens against the right shape from the start.

Mechanism

Precondition — a workload-representative benchmark exists

You cannot run this pattern without a workload-representative benchmark that your vendor can share + execute. At Meta that artifact is DCPerf.

Two-phase collaboration

  1. Pre-silicon. Run the benchmark against the vendor's architectural simulators + cycle-accurate models. Iterate on microarchitecture parameters (pipeline depths, cache sizes + hierarchy, branch-prediction structures, SoC power-management policies). "There have been multiple instances where we have been able to identify performance optimizations in areas such as CPU core microarchitecture settings and SOC power management optimizations." (Source: sources/2024-08-05-meta-dcperf-open-source-benchmark-suite)

  2. Early-silicon. Run the benchmark on the first test chips. Catch performance bugs before mass production; catch system-software issues (firmware, driver, kernel, scheduler) before the chip lands in production data centers.
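The pre-silicon phase is, at its core, a sweep loop: run the workload-representative benchmark under candidate microarchitecture configurations in the vendor's simulator and keep the settings that improve the score. The sketch below is purely illustrative: `run_simulation`, the parameter names, and the scoring model are all invented stand-ins; a real flow would launch the vendor's cycle-accurate model and parse an actual benchmark metric (e.g., a DCPerf score) from its output.

```python
from itertools import product

def run_simulation(l2_kib: int, bp_entries: int) -> float:
    """Hypothetical stand-in for a cycle-accurate simulator run.

    Toy scoring model: benchmark score improves with L2 cache size and
    branch-predictor capacity. Purely illustrative numbers.
    """
    return 100.0 + 10.0 * (l2_kib / 1024) + 5.0 * (bp_entries / 8192)

def sweep(l2_options, bp_options):
    """Evaluate every candidate config; return the best (config, score)."""
    best = None
    for l2_kib, bp_entries in product(l2_options, bp_options):
        score = run_simulation(l2_kib, bp_entries)
        if best is None or score > best[1]:
            best = ((l2_kib, bp_entries), score)
    return best

config, score = sweep([512, 1024, 2048], [4096, 8192])
print(config, round(score, 1))  # → (2048, 8192) 125.0
```

The point of the pattern is what this loop is scored against: with a workload-representative suite in the inner loop, the parameters the sweep converges on reflect the hyperscaler's production shape rather than a synthetic benchmark's.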

Duration

Meta reports two years of this collaboration cadence with "leading CPU vendors" across pre-silicon and/or early-silicon setups. It's not a one-shot engagement; it's a continuous partnership through the vendor's chip-design + bring-up timeline.

Bidirectional outcomes

Meta frames this collaboration as feeding optimizations in two directions: vendor ships a better chip for Meta's workload; Meta ships cleaner benchmarks + more characterization data back to the vendor. Both parties benefit; the benchmark is the common-language artifact.

Canonical instance — Meta + CPU vendors via DCPerf

Meta explicitly states:

"Over the last two years we have collaborated with leading CPU vendors to further validate DCPerf on pre silicon and/or early silicon setups to debug performance issues and identify hardware and system software optimizations on their roadmap products."

The source post (sources/2024-08-05-meta-dcperf-open-source-benchmark-suite) frames this as one of the "areas of HW/SW design where we have seen DCPerf being representative of production usage and being beneficial for delivering relevant performance signals and help with optimizations."

Open-sourcing DCPerf expands the partnership from Meta ↔ vendors to any hyperscale-relevant organization ↔ any vendor with access to the suite.

Why it works

  • Shifts tuning left. Optimization happens while design is still malleable, not after the chip is taped out.
  • Aligns vendor incentives with hyperscaler workload. The vendor's public SPEC numbers still matter for non-hyperscaler customers, but Meta-specific optimization is earned through DCPerf, not assumed.
  • Catches bugs cheap. Performance / correctness issues found pre-silicon cost orders of magnitude less to fix than post-silicon.
  • Enables novel architectures. Chiplet, heterogeneous-core clusters, and mixed-ISA platforms are evaluated against representative workloads before mass deployment; Meta specifically names DCPerf validation on chiplet-based architectures.

Anti-patterns

  • Wait for generally-available silicon. Too late — the relevant design decisions have already been made against someone else's benchmark.
  • NDA-only benchmark. If the benchmark can't be shared in the vendor's simulator / early-silicon environment, the partnership can't run. Meta's answer is open-sourcing DCPerf outright.
  • Aggregate-score-only evaluation. A chip can score well in aggregate while diverging from the microarchitectural behavior the benchmark is supposed to proxy; validate at the right level (see concepts/benchmark-representativeness).

Seen in
