SYSTEM Cited by 1 source

NVIDIA GB200 Grace Blackwell Superchip

The NVIDIA GB200 Grace Blackwell Superchip is the compute module at the core of NVIDIA's Blackwell-generation rack-scale AI platform. It combines Blackwell GPUs with the Arm-based Grace CPU on a single module, connected via NVLink-C2C. It is the silicon target of Meta's Catalina rack, which Meta announced at OCP Summit 2024.

Context on this wiki

The GB200 succeeds the NVIDIA H100 (Hopper) generation, which underlies Meta's two 24K-GPU training clusters for Llama 3 (see sources/2024-06-12-meta-how-meta-trains-large-language-models-at-scale). The Blackwell generation's rack-scale posture (unified Grace-CPU + Blackwell-GPU packaging, higher per-rack GPU counts, and liquid cooling required for sustained load) is what drove Meta's redesign to the 140 kW Open Rack v3 High Power Rack (ORv3 HPR).

Seen in (wiki)

  • Meta Catalina (2024-10). Catalina is "based on the NVIDIA Blackwell platform full rack-scale solution, with a focus on modularity and flexibility. It is built to support the latest NVIDIA GB200 Grace Blackwell Superchip, ensuring it meets the growing demands of modern AI infrastructure." (Source: sources/2024-10-15-meta-metas-open-ai-hardware-vision)

Why it matters

  • Rack-scale unit, not chip-scale unit. Prior generations (H100) were reasoned about per GPU and per node; GB200 designs assume the rack as the unit of deployment, with NVIDIA supplying a rack-level solution.
  • Liquid cooling as table stakes. The power envelope of GB200 racks exceeds what air cooling can handle, and that forces the generational shift to liquid-cooled infrastructure (Catalina at 140 kW).
  • Unified Arm-CPU + GPU. Placing the Grace CPU and Blackwell GPUs on one Superchip, linked by NVLink-C2C, changes the CPU/GPU trust and memory model relative to architectures that attach GPUs to an x86 host over PCIe.