
CONCEPT Cited by 1 source

Dynamic concurrency control

Definition

Dynamic concurrency control is the practice of automatically tuning client-side parallelism (number of concurrent outstanding requests) based on real-time application-level congestion signals, rather than using a fixed or manually configured concurrency limit.

Motivation

During checkpoint events in AI training, hundreds of thousands of GPUs load data simultaneously, creating sharp egress spikes. A fixed concurrency limit forces a trade-off: set low, it under-utilizes capacity during normal operation; set high, it causes congestion, timeouts, and retries during spikes. Dynamic control adapts the limit in real time.

Mechanism at Meta

The BLOB-storage SDK implements dynamic concurrency control that:

  • Monitors application-level congestion signals (timeouts, elevated latencies, retry rates)
  • Reduces parallelism when congestion is detected (backs off)
  • Increases parallelism when signals clear (ramps up)
  • Prevents the cascade: spike → congestion → timeout → retry → larger spike → GPU stall

(Source: sources/2026-07-01-meta-ai-storage-blueprint-at-scale, "Protocol Optimizations" section)
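
The back-off and ramp-up behavior listed above can be sketched as a small feedback controller. The source does not describe the SDK's actual algorithm, so this is only an illustration that assumes an additive-increase / multiplicative-decrease (AIMD) policy; the class names, thresholds, and the `WindowStats` fields are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Signals observed over one measurement window (hypothetical fields)."""
    requests: int
    timeouts: int
    retries: int
    p99_latency_ms: float


class AimdConcurrencyController:
    """Illustrative AIMD controller for the in-flight request limit.

    Not the SDK's implementation: thresholds and step sizes are assumptions.
    """

    def __init__(self, initial=16, min_limit=1, max_limit=256,
                 backoff=0.5, step=1,
                 latency_threshold_ms=200.0, error_rate_threshold=0.05):
        self.limit = initial
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.backoff = backoff
        self.step = step
        self.latency_threshold_ms = latency_threshold_ms
        self.error_rate_threshold = error_rate_threshold

    def is_congested(self, stats: WindowStats) -> bool:
        # An empty window carries no evidence of congestion.
        if stats.requests == 0:
            return False
        error_rate = (stats.timeouts + stats.retries) / stats.requests
        return (error_rate > self.error_rate_threshold
                or stats.p99_latency_ms > self.latency_threshold_ms)

    def update(self, stats: WindowStats) -> int:
        if self.is_congested(stats):
            # Back off: cut the limit multiplicatively, never below min_limit.
            self.limit = max(self.min_limit, int(self.limit * self.backoff))
        else:
            # Ramp up: probe for spare capacity one step at a time.
            self.limit = min(self.max_limit, self.limit + self.step)
        return self.limit


controller = AimdConcurrencyController(initial=16)
clear = WindowStats(requests=100, timeouts=0, retries=0, p99_latency_ms=50.0)
spike = WindowStats(requests=100, timeouts=20, retries=0, p99_latency_ms=50.0)
controller.update(clear)  # limit rises from 16 to 17
controller.update(spike)  # limit halves from 17 to 8
```

Cutting the limit sharply on congestion while raising it slowly afterwards is one way to interrupt the retry cascade: fewer requests are outstanding when timeouts begin, so retries add less load to the spike.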

Relationship to Backpressure

Dynamic concurrency control is a client-side form of concepts/backpressure — the client self-regulates based on congestion signals rather than relying on the server to push back.
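
To make the client-side aspect concrete, the sketch below shows a gate that caps in-flight requests at a limit the client adjusts itself; nothing in it waits for a server-issued signal. It is a minimal illustration, not the SDK's design, and `AdjustableGate` and its methods are hypothetical names.

```python
import threading


class AdjustableGate:
    """Caps in-flight requests at a limit that can change at runtime.

    Hypothetical helper for illustration only.
    """

    def __init__(self, limit: int):
        self._limit = max(1, limit)
        self._in_flight = 0
        self._cond = threading.Condition()

    def set_limit(self, limit: int) -> None:
        with self._cond:
            self._limit = max(1, limit)
            # Wake waiters in case the limit was raised.
            self._cond.notify_all()

    def try_acquire(self) -> bool:
        """Non-blocking: admit the request only if a slot is free."""
        with self._cond:
            if self._in_flight >= self._limit:
                return False
            self._in_flight += 1
            return True

    def acquire(self) -> None:
        """Blocking: wait until the current limit admits one more request."""
        with self._cond:
            while self._in_flight >= self._limit:
                self._cond.wait()
            self._in_flight += 1

    def release(self) -> None:
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()
```

Lowering the limit does not cancel requests already in flight; it only stops new ones from being admitted until enough outstanding requests complete.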

Seen in
