TritonBench
Definition
TritonBench is Meta's open-source benchmark and evaluation harness for Triton GPU kernels, hosted at github.com/meta-pytorch/tritonbench. It "validates numerical correctness against PyTorch baselines and measures end-to-end speedup across production input shapes" (Source: sources/2026-04-02-meta-kernelevolve-how-metas-ranking-engineer-agent-optimizes-ai-infrastructure).
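The two checks in that description, correctness against a baseline and speedup per input shape, can be sketched in plain Python. This is a minimal illustration and not TritonBench's API: `check_and_time`, `make_input`, the tolerance values, and the use of flat list lengths as "shapes" are all hypothetical, and the tolerance-based comparison is an assumption, since the source does not say how TritonBench compares outputs.

```python
import math
import time
from statistics import median


def _median_seconds(fn, arg, repeats):
    """Median wall-clock time of fn(arg) over several runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        samples.append(time.perf_counter() - start)
    # Guard against a zero reading on coarse timers.
    return max(median(samples), 1e-9)


def check_and_time(baseline, candidate, make_input, shapes,
                   rel_tol=1e-5, abs_tol=1e-8, repeats=5):
    """Compare candidate against baseline on every shape and time both.

    Hypothetical helper: tolerances and the result layout are assumptions.
    """
    results = []
    for shape in shapes:
        x = make_input(shape)
        expected = baseline(x)
        actual = candidate(x)
        # Element-wise closeness check against the baseline output.
        correct = len(expected) == len(actual) and all(
            math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
            for a, e in zip(actual, expected)
        )
        # Speedup > 1.0 means the candidate ran faster than the baseline.
        speedup = (_median_seconds(baseline, x, repeats)
                   / _median_seconds(candidate, x, repeats))
        results.append({"shape": shape, "correct": correct, "speedup": speedup})
    return results


# Toy stand-ins for a PyTorch baseline and a candidate kernel.
report = check_and_time(
    baseline=lambda xs: [2.0 * v for v in xs],
    candidate=lambda xs: [v + v for v in xs],
    make_input=lambda n: [float(i) for i in range(n)],
    shapes=[8, 64, 512],
)
for row in report:
    print(row)
```

In a real setting the callables would be a PyTorch reference and a Triton kernel operating on GPU tensors, and timing would need device synchronization, which this sketch leaves out.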
Role in KernelEvolve
TritonBench is the correctness and speedup component of KernelEvolve's multi-layer evaluation framework:
- TritonBench: numerical correctness against PyTorch baselines and end-to-end speedup across production shapes.
- PyTorch Profiler: system-level execution timelines.
- NCU (NVIDIA Nsight Compute): GPU kernel-level hardware metrics.
- Proton: intra-kernel instruction-level latency.
- MTIA Insight: MTIA-specific accelerator counters.
Together these feed structured diagnostic signals back into the LLM synthesizer, which makes KernelEvolve this wiki's canonical instance of the evaluation harness in agent loop pattern.
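The feedback loop described here can be sketched as a propose-evaluate cycle. This is a toy illustration and not KernelEvolve's implementation: the `Feedback` schema, `optimize`, `toy_propose` (standing in for the LLM synthesizer), and `toy_evaluate` (standing in for the evaluation layers) are all hypothetical, and the "kernel" is reduced to a single integer block size with an invented cost model.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    """Structured signal returned by the evaluator (hypothetical schema)."""
    candidate: int
    correct: bool
    speedup: float
    notes: str


def optimize(propose, evaluate, rounds):
    """Run the loop: propose a candidate, evaluate it, feed the result back."""
    best, best_fb, feedback = None, None, None
    for _ in range(rounds):
        candidate = propose(feedback)   # synthesizer sees the last feedback
        feedback = evaluate(candidate)  # harness returns structured signal
        # Only correct candidates are eligible; keep the fastest one.
        if feedback.correct and (best_fb is None
                                 or feedback.speedup > best_fb.speedup):
            best, best_fb = candidate, feedback
    return best, best_fb


def toy_evaluate(block_size):
    """Invented cost model: larger blocks are faster until they break."""
    if block_size > 256:
        return Feedback(block_size, False, 0.0, "output mismatch vs reference")
    return Feedback(block_size, True, block_size / 64, "ok")


def toy_propose(feedback):
    """Stand-in for the synthesizer: grow while correct, back off on failure."""
    if feedback is None:
        return 32
    if not feedback.correct:
        return feedback.candidate // 2
    return feedback.candidate * 2


best, best_fb = optimize(toy_propose, toy_evaluate, rounds=6)
print(best, best_fb)
```

The design point is that the evaluator returns a structured record, not a bare pass/fail, so the proposer can react differently to a correctness failure and to a slow but correct candidate.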
Seen in
- Meta KernelEvolve (2026-04-02, canonical). Named as the Triton-kernel correctness + speedup benchmark inside KernelEvolve's evaluator. (Source: sources/2026-04-02-meta-kernelevolve-how-metas-ranking-engineer-agent-optimizes-ai-infrastructure)
Caveats
The 2026-04-02 post does not describe TritonBench's internal architecture (test-case format, baseline-comparison methodology, supported shapes) beyond its one-line description. See the repository at github.com/meta-pytorch/tritonbench for details.
Related
- companies/meta
- systems/kernelevolve — the agentic system that composes TritonBench into its automated evaluation framework.
- systems/triton-dsl — the DSL TritonBench benchmarks kernels written in.
- systems/kernelbench — Stanford's 250-problem benchmark suite; KernelEvolve hits 100% pass rate.
- patterns/evaluation-harness-in-agent-loop — the pattern TritonBench is one layer of.