Meta — Meta's open AI hardware vision¶
Summary¶
Meta's 2024-10-15 post, timed to the Open Compute Project (OCP) Global Summit 2024, announces the next generation of Meta's AI-hardware stack and contributes the designs to OCP. Four headline disclosures: (1) Catalina, a new high-powered AI rack built on the NVIDIA Blackwell platform (GB200 Grace Blackwell Superchip), using the liquid-cooled OCP ORv3 high-power rack (HPR) capable of up to 140 kW; (2) Grand Teton expanded to support the AMD Instinct MI300X and contributed to OCP; (3) the Disaggregated Scheduled Fabric (DSF), Meta's vendor-agnostic AI networking backend built on OCP-SAI + FBOSS + Ethernet/RoCE, enabling multi-vendor endpoint/NIC/accelerator integration, plus new 51T fabric switches on Broadcom and Cisco ASICs and an FBNIC module containing Meta's first in-house network ASIC; (4) Mount Diablo, a disaggregated power rack co-developed with Microsoft around a scalable 400 VDC unit. The post also projects forward: Meta anticipates injection bandwidth of ~1 TB/s per accelerator with matching normalized bisection bandwidth, more than an order-of-magnitude growth over today's AI fabrics.
Key takeaways¶
- Scaling trajectory named explicitly — an order-of-magnitude on network bandwidth is coming. "In the next few years, we anticipate greater injection bandwidth on the order of a terabyte per second, per accelerator, with equal normalized bisection bandwidth. This represents a growth of more than an order of magnitude compared to today's networks!" Meta frames the supporting requirement: "a high-performance, multi-tier, non-blocking network fabric that can utilize modern congestion control to behave predictably under heavy load." This is the forward projection under which Catalina + DSF + FBNIC + 51T switches are being designed. (Source text)
- Cluster scale already at 24K × 2 and growing. "[Llama 3.1 405B] pushed our infrastructure to operate across more than 16,000 NVIDIA H100 GPUs… Today, we're training our models on two 24K-GPU clusters. We don't expect this upward trajectory for AI clusters to slow down any time soon." The post re-anchors the Meta training substrate from sources/2024-06-12-meta-how-meta-trains-large-language-models-at-scale (two 24K-GPU H100 clusters, Grand Teton @ 700 W air-cooled) as the previous generation. Catalina is the next-step platform. (Source text; see sources/2024-06-12-meta-how-meta-trains-large-language-models-at-scale)
- Catalina — 140 kW liquid-cooled ORv3 on NVIDIA GB200 Blackwell. "With Catalina we're introducing the ORv3, a high-power rack (HPR) capable of supporting up to 140kW. The full solution is liquid cooled and consists of a power shelf that supports a compute tray, switch tray, the ORv3 HPR, the Wedge 400 fabric switch, a management switch, battery backup unit, and a rack management controller." Catalina is built on the NVIDIA Blackwell platform's full rack-scale solution and supports the NVIDIA GB200 Grace Blackwell Superchip. Modularity and flexibility are the stated design principles — "to empower others to customize the rack to meet their specific AI workloads while leveraging both existing and emerging industry standards." A major shift from the Grand-Teton-@-700W-air-cooled approach; see systems/catalina-rack. (Source text)
- Grand Teton expanded to AMD MI300X — monolithic platform principle preserved. Meta's 2022-era Grand Teton AI platform (successor to Zion-EX, designed for DLRM + content-understanding workloads) gets a new variant supporting the AMD Instinct MI300X accelerator and the design is contributed to OCP. "Like its predecessors, this new version of Grand Teton features a single monolithic system design with fully integrated power, control, compute, and fabric interfaces. This high level of integration simplifies system deployment, enabling rapid scaling with increased reliability for large-scale AI inference workloads." Grand Teton is now a multi-accelerator platform (NVIDIA H100 + AMD MI300X); see patterns/modular-rack-for-multi-accelerator. (Source text)
- Disaggregated Scheduled Fabric (DSF) — vendor-agnostic AI backend. "Developing open, vendor-agnostic networking backend is going to play an important role going forward… Disaggregating our network allows us to work with vendors from across the industry to design systems that are innovative as well as scalable, flexible, and efficient." DSF "offers several advantages over our existing switches. By opening up our network fabric we can overcome limitations in scale, component supply options, and power density." Powered by OCP-SAI + FBOSS (Meta's own network operating system) + Ethernet RoCE to endpoints. Multi-vendor NIC/GPU support: NVIDIA + Broadcom + AMD named explicitly. See systems/meta-dsf-disaggregated-scheduled-fabric + concepts/network-fabric-disaggregation. (Source text)
- 51T fabric switches + FBNIC. "We have also developed and built new 51T fabric switches based on Broadcom and Cisco ASICs. Finally, we are sharing our new FBNIC, a new NIC module that contains our first Meta-design network ASIC." Silicon-level response to the projected TB/s-per-accelerator bandwidth. FBNIC is Meta's first in-house network ASIC — vertical integration step analogous to the server/rack self-design lineage (OCP, Grand Teton). (Source text)
- Mount Diablo (with Microsoft) — disaggregated 400 VDC power rack. "Our current collaboration focuses on Mount Diablo, a new disaggregated power rack. It's a cutting-edge solution featuring a scalable 400 VDC unit that enhances efficiency and scalability. This innovative design allows more AI accelerators per IT rack, significantly advancing AI infrastructure." Disaggregates the power rack from the IT rack — the same architectural stance as DSF at the network level, applied to power delivery. Higher voltage (400 VDC vs conventional 48 VDC OCP) reduces current / copper / losses for the same kW. See systems/mount-diablo-power-rack + concepts/400-vdc-rack-power. (Source text)
- The open-hardware thesis stated explicitly. "Scaling AI at this speed requires open hardware solutions… By investing in open hardware, we unlock AI's full potential and propel ongoing innovation in the field." And later: "We also need open AI hardware systems. These systems are necessary for delivering the kind of high-performance, cost-effective, and adaptable infrastructure necessary for AI advancement." Meta positions OCP-contribution as the natural consequence of the scaling curve — closed-source hardware cannot keep pace. Canonical patterns/open-hardware-for-ai-scaling. (Source text)
- Meta × Microsoft OCP lineage named. "Meta and Microsoft have a long-standing partnership within OCP, beginning with the development of the Switch Abstraction Interface (SAI) for data centers in 2018." Other joint contributions: Open Accelerator Module (OAM) standard, SSD standardization, and now Mount Diablo. See patterns/co-design-with-ocp-partners. (Source text)
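The power-delivery argument behind Mount Diablo's 400 VDC choice can be made concrete with P = V·I arithmetic. A minimal sketch, assuming a Catalina-class 140 kW rack and a made-up busbar resistance (the post discloses neither distribution resistance nor conversion topology):

```python
# Illustrative only: busbar current and conduction loss when delivering
# 140 kW (Catalina's ORv3 HPR ceiling, from the post) at 48 VDC vs 400 VDC.
# The resistance value is a hypothetical placeholder, not a disclosed figure.

RACK_POWER_W = 140_000          # rack power envelope (from the post)
BUSBAR_RESISTANCE_OHM = 0.0002  # hypothetical distribution resistance

for volts in (48, 400):
    amps = RACK_POWER_W / volts               # I = P / V
    loss_w = amps ** 2 * BUSBAR_RESISTANCE_OHM  # I^2 * R conduction loss
    print(f"{volts:>3} VDC: {amps:7.0f} A, conduction loss ~{loss_w:.0f} W")
```

At fixed power, current scales as 1/V and conduction loss as 1/V², so moving from 48 V to 400 V cuts busbar current roughly 8.3× and I²R loss roughly 69× for the same resistance: this is the current/copper/loss argument the Mount Diablo takeaway gestures at.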
Systems / hardware extracted¶
- systems/catalina-rack — Meta's new liquid-cooled 140 kW ORv3 high-power rack on NVIDIA GB200 Blackwell; modular + flexible.
- systems/orv3-rack — the Open Rack v3 HPR variant (up to 140 kW) being contributed to OCP via Catalina.
- systems/nvidia-gb200-grace-blackwell — NVIDIA's Blackwell-generation Grace+Blackwell Superchip, the silicon Catalina is built around.
- systems/amd-instinct-mi300x — AMD's flagship data-center GPU; now supported on Grand Teton.
- systems/meta-dsf-disaggregated-scheduled-fabric — Meta's open vendor-agnostic AI backend fabric. OCP-SAI + FBOSS + Ethernet/RoCE.
- systems/fboss-meta-network-os — FBOSS, Meta's in-house network operating system for its switches.
- systems/ocp-sai — the Switch Abstraction Interface open standard co-developed by Meta and Microsoft for OCP in 2018.
- systems/fbnic — Meta's first in-house network ASIC module.
- systems/mount-diablo-power-rack — Meta × Microsoft OCP-contributed disaggregated 400 VDC power rack.
- systems/meta-wedge-400 — the Meta Wedge 400 fabric switch used in Catalina.
- systems/oam-open-accelerator-module — OCP's accelerator-module standard.
- systems/grand-teton — extended: now supports AMD MI300X in addition to NVIDIA H100.
- systems/nvidia-h100 — the prior-generation accelerator supported on Grand Teton (2024-06-12 clusters).
- systems/roce-rdma-over-converged-ethernet — the Ethernet-RDMA fabric DSF is built on.
- systems/llama-3-1 — 405B training referenced as the 16K-H100 anchor that kicked off the next scaling step.
Concepts extracted¶
- concepts/network-fabric-disaggregation — the architectural stance of splitting a vertically-integrated fabric into open, vendor-replaceable layers.
- concepts/liquid-cooled-ai-rack — liquid cooling as the enabler of > 100 kW rack-level power density.
- concepts/injection-bandwidth-ai-cluster — per-accelerator network bandwidth, projected to reach ~1 TB/s.
- concepts/bisection-bandwidth — the classical HPC-networking measurement; projected to scale in lockstep with injection BW.
- concepts/400-vdc-rack-power — DC power delivery at 400 V to reduce copper / current / loss at high rack kW.
Existing concepts reinforced:
- concepts/rack-level-power-density — Catalina's 140 kW ORv3 extends the upper bound disclosed on the wiki (Dropbox's 7th-gen sits at ~16 kW/rack air-cooled; Catalina's 140 kW liquid-cooled is a ~8.75× step, nearly an order of magnitude).
Patterns extracted¶
- patterns/open-hardware-for-ai-scaling — Meta's thesis: AI scale requires the hardware layer to move at the pace of the software layer, which requires open-source contribution rather than vendor-locked designs.
- patterns/modular-rack-for-multi-accelerator — Grand Teton's "single monolithic system design with fully integrated power, control, compute, and fabric interfaces" extended across NVIDIA + AMD accelerators; Catalina extending the pattern to GB200.
- patterns/co-design-with-ocp-partners — Meta × Microsoft lineage (SAI 2018 → OAM → Mount Diablo 2024) as the operational model.
Operational numbers¶
- Catalina rack power: up to 140 kW (ORv3 HPR), liquid-cooled.
- Mount Diablo: 400 VDC scalable unit.
- Fabric switches: 51 Tbps on Broadcom + Cisco ASICs.
- Projected per-accelerator injection bandwidth: ~1 TB/s.
- Projected bisection bandwidth: "equal normalized" to injection bandwidth — i.e. a non-blocking, non-oversubscribed fabric at full cluster scale.
- Current training scale anchors: Llama 3.1 405B at > 16,000 H100 GPUs on 15T tokens; two concurrent 24,000-GPU training clusters today.
- Growth target: > 10× bandwidth scale vs today's networks.
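The "equal normalized bisection bandwidth" target can be sketched numerically: in a non-blocking fabric, the bandwidth across any bisection must equal the aggregate injection bandwidth of the endpoints on one side. A minimal sketch, reusing today's 24K-GPU cluster size as a stand-in (the post gives no target cluster size, and the per-leaf port math below is an illustration, not Meta's topology):

```python
# Sketch: what "~1 TB/s injection with equal normalized bisection
# bandwidth" implies for fabric sizing. Cluster size is a stand-in.

INJECTION_TBPS = 8.0     # ~1 TB/s per accelerator ~= 8 Tbps
SWITCH_TBPS = 51.2       # 51T-class fabric switch ASIC (from the post)
N_ACCELERATORS = 24_000  # today's cluster scale, reused hypothetically

# Non-blocking: bisection bandwidth = (endpoints / 2) * injection BW.
bisection_tbps = (N_ACCELERATORS / 2) * INJECTION_TBPS
print(f"required bisection: {bisection_tbps / 1000:.0f} Pbps")

# In a non-blocking Clos, a leaf splits its radix half toward endpoints
# and half toward the fabric, so a 51.2T leaf serves few 8 Tbps ports.
endpoints_per_leaf = int(SWITCH_TBPS / 2 // INJECTION_TBPS)
print(f"8 Tbps endpoints per non-blocking 51.2T leaf: {endpoints_per_leaf}")
```

Only three 8 Tbps endpoints per non-blocking 51.2T leaf is the kind of pressure behind the post's call for "a high-performance, multi-tier, non-blocking network fabric."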
Caveats¶
- Announcement voice, not retrospective. The post is keyed to OCP Summit 2024; it announces designs rather than reports on operational production experience. No Catalina production numbers, no DSF deployment scale, no FBNIC silicon perf data.
- Open-source release timing not fully specified — the post says "upcoming release" for Catalina; Grand-Teton-with-MI300X is being contributed to OCP; exact availability dates not disclosed.
- No disclosure of Catalina GPU count per rack — the "single monolithic system design" principle for the prior Grand Teton is preserved in Catalina's Blackwell variant, but the exact compute-tray count + GPU count per rack is not given.
- FBNIC feature set not disclosed. Meta names it as "first Meta-design network ASIC" — packet-processing feature set, software offload model, pipeline depth, or any perf data are not in this post.
- Mount Diablo deployment timing not given. The collaboration is described as "current" but power-rack availability dates are not disclosed.
- No comparison of Catalina vs Grand-Teton-H100 TCO, no disclosure of how Catalina's 140 kW + liquid cooling change data-center facility design (CRAH, CDU, manifold density). Implicit: they change everything.
- The "growth of more than an order of magnitude" is the post's own forward projection, not a published roadmap milestone.
- Llama 3.1 405B training was on 16K H100s per this post, consistent with the 2024-06-12 post's two 24K-H100 clusters (training uses a subset of a cluster); this is not a contradiction, it's Meta disclosing the subset-scale for one specific training run.
Related wiki pages¶
- companies/meta — parent company page.
- sources/2024-06-12-meta-how-meta-trains-large-language-models-at-scale — the 2024-06 training-substrate post this 2024-10 post succeeds. That post is the current state (Grand Teton @ 700 W H100 air-cooled); this post is the next state (Catalina @ 140 kW GB200 liquid-cooled).
- sources/2024-08-05-meta-a-roce-network-for-distributed-ai-training-at-scale — the SIGCOMM-2024 RoCE deep-dive. DSF is the next step past the 24K-GPU RoCE cluster's fabric.
- sources/2025-08-08-dropbox-seventh-generation-server-hardware — the direct prior-art concepts/rack-level-power-density datum on the wiki (16 kW air-cooled); Catalina's 140 kW liquid-cooled is the hyperscaler-scale counterpart.