

Llama 3

Llama 3 is Meta's April-2024 open-weights foundation-model release (8B and 70B; the 405B followed in July 2024 as Llama 3.1). For this wiki, the operationally interesting fact is that Llama 3 was trained simultaneously on both of Meta's 24K-GPU H100 clusters: one built on RoCE (RDMA over Converged Ethernet), the other on InfiniBand. The largest Llama 3 model was trained on the RoCE cluster.

"We used both InfiniBand and RoCE clusters to train Llama 3, with the RoCE cluster used for training the largest model." (Source: sources/2024-06-12-meta-how-meta-trains-large-language-models-at-scale)

This is a rare datum: a production-scale, side-by-side comparison of AI training fabrics is almost absent from the open literature, and it is the reason Meta's 2024-06-12 post is architecturally significant.


Relation to Llama 3.1

Llama 3.1 is the July-2024 update to the Llama 3 family (8B / 70B / 405B), already documented elsewhere on this wiki as the adaptation base for eBay's e-Llama and Instacart's SRL model. Meta's 2024-06-12 infra post predates Llama 3.1's public release, but the infrastructure it describes is the substrate Llama 3.1 was subsequently trained on; Meta had not published a separate infrastructure retrospective for 3.1 as of this wiki entry.

Relationship:

  • Llama 3 — April 2024, 8B + 70B release.
  • Llama 3.1 — July 2024 update, 8B + 70B + 405B; same infra family.

Stub

More content to add as Meta publishes further infra retrospectives (ingest candidates: the Building Meta's GenAI Infrastructure 2024-03-12 post, Llama 3.1 model-card disclosures, the Llama 3 Herd paper).
