CONCEPT Cited by 3 sources

Bisection bandwidth

Definition

Bisection bandwidth is the aggregate bandwidth available across the narrowest cut that divides a network into two equal halves. For a non-blocking Clos fabric, it equals the sum of the downlink bandwidths of all the hosts on one side of the cut. Put differently, the fabric is "full-bisection" when any half of the hosts can simultaneously communicate with the other half at their full link speed.
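The definition can be made concrete with a little arithmetic. The sketch below is a simplified model, not any vendor's or Meta's sizing method: it assumes a symmetric two-tier leaf-spine Clos, a cut that places half the leaves on each side, and traffic spread evenly over the uplinks. All function names and the example numbers are hypothetical.

```python
def full_bisection_gbps(num_hosts: int, host_link_gbps: float) -> float:
    """Bandwidth a full-bisection fabric must carry across the cut:
    half the hosts sending to the other half at line rate."""
    return (num_hosts // 2) * host_link_gbps


def leaf_spine_bisection_gbps(
    num_leaves: int,
    hosts_per_leaf: int,
    host_link_gbps: float,
    uplinks_per_leaf: int,
    uplink_gbps: float,
) -> float:
    """Bisection bandwidth of a symmetric two-tier leaf-spine fabric
    (simplified model: half the leaves sit on each side of the cut)."""
    leaves_per_side = num_leaves // 2
    # What the hosts on one side could offer at full link speed.
    host_demand = leaves_per_side * hosts_per_leaf * host_link_gbps
    # What the uplinks of those leaves can actually carry toward the spines.
    uplink_capacity = leaves_per_side * uplinks_per_leaf * uplink_gbps
    return min(host_demand, uplink_capacity)


def oversubscription_ratio(
    hosts_per_leaf: int,
    host_link_gbps: float,
    uplinks_per_leaf: int,
    uplink_gbps: float,
) -> float:
    """Host-facing bandwidth divided by uplink bandwidth on one leaf.
    1.0 or less means non-blocking at this tier."""
    return (hosts_per_leaf * host_link_gbps) / (uplinks_per_leaf * uplink_gbps)


# Hypothetical cluster: 32 leaves x 32 hosts at 400 Gbps, 16 x 400 Gbps uplinks per leaf.
needed = full_bisection_gbps(32 * 32, 400.0)                     # 204800.0
actual = leaf_spine_bisection_gbps(32, 32, 400.0, 16, 400.0)     # 102400.0
ratio = oversubscription_ratio(32, 400.0, 16, 400.0)             # 2.0
print(needed, actual, ratio)
```

In this made-up configuration the fabric delivers half of the full-bisection figure, which is what a 2:1 oversubscription at the leaf tier means in practice.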

For AI training clusters, bisection bandwidth is the classical measure of whether the fabric can support arbitrary all-to-all communication patterns without becoming the bottleneck. That matters because the collectives used in 3D parallelism (AllReduce, AllGather, ReduceScatter) move data between large groups of accelerators at the same time.

Meta's framing (2024)

The concept recurs across three Meta posts:

Why it matters

  • Defines whether collectives scale. When a collective spans the whole cluster, the cross-cut throughput available to each participant is bounded by the bisection bandwidth divided by the number of participants sending across the cut; under-provisioned bisection means under-utilised GPUs.
  • Non-blocking Clos is the stock answer, but expensive. Full-bisection at AI-cluster scale means a lot of switch silicon and a lot of cabling; Meta's AI Zone template instead provides non-blocking within the Zone and accepts oversubscription across Zones, compensated by topology-aware scheduling.
  • Pairs with injection bandwidth. A fabric with high per-accelerator injection bandwidth but oversubscribed bisection chokes at collective time; Meta's 2024-10 forward-looking design targets both in lockstep.
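The interplay between injection bandwidth and bisection bandwidth in the points above can be sketched as a simple upper bound. This is an illustrative worst-case model only: it assumes every GPU on one side of the cut sends to the other side at once, and it does not model any particular collective algorithm or scheduler. The function name and the numbers are hypothetical.

```python
def per_gpu_cross_cut_gbps(
    bisection_gbps: float,
    num_gpus: int,
    injection_gbps: float,
) -> float:
    """Upper bound on the rate each GPU can sustain across the bisection
    when half the GPUs send to the other half simultaneously."""
    senders = num_gpus // 2
    # Each sender's fair share of the capacity across the cut.
    fabric_share = bisection_gbps / senders
    # A GPU can never exceed its own injection bandwidth.
    return min(injection_gbps, fabric_share)


# Hypothetical 1,024-GPU cluster with 400 Gbps of injection bandwidth per GPU.
full = per_gpu_cross_cut_gbps(204_800.0, 1024, 400.0)   # 400.0: fabric is not the limit
half = per_gpu_cross_cut_gbps(102_400.0, 1024, 400.0)   # 200.0: bisection is the limit
print(full, half)
```

With full bisection the bound equals the injection bandwidth, so the NIC is the limit. With half the bisection, each GPU can use at most half of its injection bandwidth for cross-cut traffic, which is why the two figures need to be provisioned together.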

Seen in
