CONCEPT Cited by 1 source
Dynamic concurrency control¶
Definition¶
Dynamic concurrency control is the practice of automatically tuning client-side parallelism (number of concurrent outstanding requests) based on real-time application-level congestion signals, rather than using a fixed or manually configured concurrency limit.
Motivation¶
During checkpoint events in AI training, hundreds of thousands of GPUs load data simultaneously, creating sharp egress spikes. A fixed concurrency limit forces a trade-off: set low, it under-utilizes capacity during normal operation; set high, it causes congestion, timeouts, and retries during spikes. Dynamic control adjusts the limit in real time instead.
Mechanism at Meta¶
The BLOB-storage SDK implements dynamic concurrency control that:
- Monitors application-level congestion signals (timeouts, elevated latencies, retry rates)
- Reduces parallelism when congestion is detected (backs off)
- Increases parallelism when the signals clear (ramps up)
- Prevents the cascade: spike → congestion → timeout → retry → larger spike → GPU stall
(Source: sources/2026-07-01-meta-ai-storage-blueprint-at-scale, "Protocol Optimizations" section)
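The source does not say which control algorithm the SDK uses. As an illustration only, the back-off and ramp-up behavior listed above can be sketched with additive-increase/multiplicative-decrease (AIMD), the scheme familiar from TCP congestion control. Every name and default below (`AimdConcurrencyLimiter`, `latency_threshold_s`, `backoff_factor`, the limits) is a hypothetical choice for this sketch, not part of Meta's SDK.

```python
from dataclasses import dataclass


@dataclass
class AimdConcurrencyLimiter:
    """Hypothetical AIMD controller for a client's in-flight request limit.

    Illustrative sketch only; not the algorithm used by Meta's SDK.
    """

    limit: int = 8                     # current allowed concurrent requests
    min_limit: int = 1                 # never back off below this
    max_limit: int = 256               # never ramp up above this
    latency_threshold_s: float = 0.5   # assumed cutoff for "elevated latency"
    backoff_factor: float = 0.5        # multiplicative decrease on congestion
    increase_step: int = 1             # additive increase on a clean result

    def on_result(self, latency_s: float, timed_out: bool = False,
                  retried: bool = False) -> int:
        """Record one completed request and return the updated limit."""
        # Any of the three application-level signals counts as congestion.
        congested = (timed_out or retried
                     or latency_s > self.latency_threshold_s)
        if congested:
            # Back off sharply so the spike is not amplified by retries.
            self.limit = max(self.min_limit,
                             int(self.limit * self.backoff_factor))
        else:
            # Ramp up gently while the signals stay clear.
            self.limit = min(self.max_limit,
                             self.limit + self.increase_step)
        return self.limit


limiter = AimdConcurrencyLimiter(limit=8)
limiter.on_result(latency_s=0.1)                  # clean result: ramps up
limiter.on_result(latency_s=2.0, timed_out=True)  # congestion: backs off
print(limiter.limit)
```

In a real client the returned limit would bound a semaphore or worker pool. Halving on congestion while growing by one on success is what breaks the retry-amplification loop: the limit shrinks faster than retries can refill it.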
Relationship to Backpressure¶
Dynamic concurrency control is a client-side form of concepts/backpressure: the client regulates itself based on the congestion signals it observes instead of relying on the server to push back.
Seen in¶
- sources/2026-07-01-meta-ai-storage-blueprint-at-scale — egress spike management for AI checkpoint loading