SYSTEM Cited by 1 source

NCU (NVIDIA Nsight Compute)¶

Definition¶

NCU (Nsight Compute) is NVIDIA's GPU kernel-level profiler — it reports per-kernel hardware metrics: occupancy, memory throughput, instruction mix, warp stall reasons, L1/L2 cache behavior, SM utilization, and more. Inside Meta's KernelEvolve it is one layer of a multi-tool evaluation framework that produces structured diagnostic signal (not just scalar wall-clock time) for the agent's next round of candidate generation (Source: sources/2026-04-02-meta-kernelevolve-how-metas-ranking-engineer-agent-optimizes-ai-infrastructure).

Role in KernelEvolve¶

NCU is used to answer "why is this candidate slow?" — is it memory-bound, compute-bound, or occupancy-limited? — and the answer is fed back to the LLM synthesizer as part of its context-aware prompt for the next round of tree-search node expansions:

"The search engine doesn't just see 'kernel A is 1.2x faster than kernel B' — it sees why: whether the bottleneck is memory-bound, compute-bound, or limited by occupancy — and feeds that diagnostic signal back into the LLM synthesizer to guide the next round of candidates."

Canonical wiki datum for why NVIDIA's kernel-level profiler is a load-bearing input, not an optional debugging tool, at the agentic-kernel-optimization layer of the AI stack.

Seen in¶

Meta KernelEvolve (2026-04-02, canonical). Named as one of the GPU-side profilers in KernelEvolve's evaluation stack. (Source: sources/2026-04-02-meta-kernelevolve-how-metas-ranking-engineer-agent-optimizes-ai-infrastructure)

Caveats¶

The 2026-04-02 post does not document which specific NCU metrics KernelEvolve reads (NCU has hundreds). Public NCU documentation is at developer.nvidia.com/nsight-compute.

companies/meta — the KernelEvolve deployment context.
systems/kernelevolve — the agentic kernel-synthesis system that consumes NCU output.
systems/proton-profiler — the complementary intra-kernel (instruction-level latency) profiler.
systems/tritonbench — the correctness + speedup layer; NCU is the hardware-metrics layer.
patterns/evaluation-harness-in-agent-loop — the pattern NCU is one component of.

NCU (NVIDIA Nsight Compute)¶

Definition¶

Role in KernelEvolve¶

Seen in¶

Caveats¶

Related¶