SYSTEM Cited by 1 source

TritonBench¶

Definition¶

TritonBench is Meta's open-source benchmark + evaluation harness for Triton GPU kernels, hosted at github.com/meta-pytorch/tritonbench. It "validates numerical correctness against PyTorch baselines and measures end-to-end speedup across production input shapes" (Source: sources/2026-04-02-meta-kernelevolve-how-metas-ranking-engineer-agent-optimizes-ai-infrastructure).
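To make the one-line description concrete, here is a minimal sketch of a correctness-and-speedup check in plain Python. It is not TritonBench's API: `evaluate`, `best_time`, the two `sum_of_squares` functions, the tolerance, and the input sizes are all hypothetical stand-ins. Real kernels would be timed on the GPU with proper synchronization rather than with `time.perf_counter`.

```python
import math
import time


def reference_sum_of_squares(xs):
    # Hypothetical baseline, standing in for a PyTorch reference op.
    return sum(x * x for x in xs)


def candidate_sum_of_squares(xs):
    # Hypothetical candidate, standing in for a generated Triton kernel.
    return math.fsum(x * x for x in xs)


def best_time(fn, arg, repeats=5):
    # Best wall-clock time over a few runs, to damp scheduling noise.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        best = min(best, time.perf_counter() - start)
    return best


def evaluate(reference, candidate, sizes, rel_tol=1e-9):
    # For each input size: compare outputs within a tolerance, then
    # report speedup as reference time divided by candidate time.
    results = []
    for n in sizes:
        xs = [i / n for i in range(n)]
        correct = math.isclose(reference(xs), candidate(xs), rel_tol=rel_tol)
        ref_t = best_time(reference, xs)
        cand_t = best_time(candidate, xs)
        speedup = ref_t / cand_t if cand_t > 0 else float("inf")
        results.append({"size": n, "correct": correct, "speedup": speedup})
    return results


if __name__ == "__main__":
    sizes = [1_000, 10_000]  # assumed sizes, not production shapes
    for row in evaluate(reference_sum_of_squares, candidate_sum_of_squares, sizes):
        print(row)
```

The sketch checks correctness before it looks at speed, because a speedup figure means nothing for a candidate whose output diverges from the baseline. Running the check per input size rather than once mirrors the "across production input shapes" part of the description.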

Role in KernelEvolve¶

TritonBench is the correctness + speedup component of KernelEvolve's multi-layer evaluation framework:

TritonBench: numerical correctness against PyTorch baselines + end-to-end speedup across production shapes.
PyTorch Profiler: system-level execution timelines.
NCU (NVIDIA Nsight Compute): GPU kernel-level hardware metrics.
Proton: intra-kernel instruction-level latency.
MTIA Insight: MTIA-specific accelerator counters.

Together these feed structured diagnostic signal back into the LLM synthesizer, which makes this framework the wiki's canonical instance of the evaluation-harness-in-agent-loop pattern.
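As a rough illustration of what "structured diagnostic signal" could look like, the sketch below merges per-layer reports into a single JSON document that a kernel-generating model could read on its next attempt. The source does not describe KernelEvolve's actual feedback format, so `LayerReport`, `build_feedback`, and every field name here are assumptions, and the numbers in the usage example are placeholders rather than measured results.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LayerReport:
    # One evaluation layer's findings; all field names are hypothetical.
    layer: str
    passed: bool
    metrics: dict = field(default_factory=dict)


def build_feedback(reports):
    # Merge per-layer reports into one JSON document, listing the
    # failing layers up front so they are easy to spot.
    failing = [r.layer for r in reports if not r.passed]
    return json.dumps(
        {
            "all_passed": not failing,
            "failing_layers": failing,
            "layers": [asdict(r) for r in reports],
        },
        indent=2,
        sort_keys=True,
    )


if __name__ == "__main__":
    # Placeholder values for illustration only, not real measurements.
    reports = [
        LayerReport("tritonbench", True, {"speedup": 1.4}),
        LayerReport("ncu", False, {"note": "low occupancy"}),
    ]
    print(build_feedback(reports))
```

Keeping each layer's output in the same shape lets the consumer treat a correctness failure and a profiler finding uniformly, which is one plausible reason to make the signal structured and not free-form log text.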

Seen in¶

Meta KernelEvolve (2026-04-02, canonical). Named as the Triton-kernel correctness + speedup benchmark inside KernelEvolve's evaluator. (Source: sources/2026-04-02-meta-kernelevolve-how-metas-ranking-engineer-agent-optimizes-ai-infrastructure)

Caveats¶

The 2026-04-02 post does not describe TritonBench's internal architecture (test-case format, baseline-comparison methodology, supported shapes) beyond the one-line description. See the repository at github.com/meta-pytorch/tritonbench for details.

Related¶

companies/meta
systems/kernelevolve — the agentic system that composes TritonBench into its automated evaluation framework.
systems/triton-dsl — the DSL TritonBench benchmarks kernels written in.
systems/kernelbench — Stanford's 250-problem benchmark suite; KernelEvolve hits 100% pass rate.
patterns/evaluation-harness-in-agent-loop — the pattern TritonBench is one layer of.
