Adopting AV1 for Real-Time Communication (RTC) at Scale
Summary
Meta describes its multi-year effort to deploy the AV1 video codec for real-time communication (RTC) across Messenger and WhatsApp. The article covers the full production stack: encoder and decoder selection for mobile power efficiency; ML-based device eligibility classification; runtime codec complexity adaptation (preset tuning, latency-aware codec switching, and asymmetric send/receive codecs); accurate VBV-based rate control that prevents overshoot and undershoot; and error-resilience strategies (temporal layers and Long-Term Reference frames) that recover from loss without keyframe floods. AV1 is now enabled on the majority of mobile devices in Meta RTC applications.
Key Takeaways
- 20%+ bitrate reduction with AV1 vs H.264/AVC under product settings on low-end and mid-range devices: the fundamental motivation. At ≤100 kbps (common in emerging markets), AV1 remains visually clear while H.264 is noticeably blurry (Source: raw article intro).
- A low-complexity AV1 encoder preset achieves power parity with H.264/AVC, enabling AV1 on mid-range and low-end phones. An off-the-shelf open-source AV1 encoder drew 14% more power than H.264 on a Pixel 8; Meta's internal low-complexity encoder eliminates this gap entirely (Source: "Encoder and Decoder Selection" section).
- dav1d was selected as the AV1 decoder after A/B testing among multiple open-source decoders. It was chosen for superior power efficiency and reliability, with measurable talk-time extension on mobile (Source: "Decoder Selection" section).
- Binary size is a first-order deployment constraint at billion-user scale: 600 kB compressed (AV1 encoder + decoder) could consume an entire year's binary size budget for a large app. Meta pursued dynamic download (unreliable), QM-table optimization (the QM tool was ~10% of the encoder binary and was halved), shared codec libraries, and platform codec reuse (Source: "Binary Size" section).
- ML-based device eligibility framework replaces heuristic rules (memory, year, and OS version proved insufficient). It collects low-level real-world performance metrics through a logging pipeline, outputs an `rtc_score` per device, and uses that score to determine AV1 capability. Iterative refinement (V1.1 → V2) with a two-tier approach differentiating high-end vs low-end encoding capability (Source: "AV1 Device Eligibility" section).
- Three-layer codec complexity adaptation handles the reality that even 2023 smartphones throttle CPU during calls: (a) adaptive encoder preset adjustment based on monitored encoding latency; (b) a local encoding-latency-aware codec switch to H.264 if lowering the AV1 preset is insufficient; (c) a peer decoding-latency-aware codec switch driven by continuous feedback. Battery level is also considered (Source: "Codec Complexity Adaptation" section).
- Asymmetric codec design: mid-range devices that cannot encode AV1 in real time can still decode it, so they send H.264 but receive AV1 from high-end peers. This significantly increases AV1 coverage across the fleet (Source: "Asymmetric Codec Design" section).
- VBV (Video Buffering Verifier) delay serves as the rate-control accuracy metric, with a target of <200 ms. Overshoot causes congestion and freezes; undershoot misleads bandwidth estimation and slows ramp-up. The encoder tracks VBV buffer status frame by frame, strictly limits keyframe bitrate, and compensates in subsequent frames (Source: "Accurate Rate Control" section).
- Reference Picture Resampling (RPR): an AV1 feature that allows resolution changes without generating a keyframe, significantly reducing bitrate spikes and video freezes during dynamic resolution adaptation (Source: "Rate Control Optimization" section).
- Temporal Layers (TL) for error resilience: a two-layer structure in which the base layer (TL0) maintains continuity without depending on the enhancement layer. FEC protects the base layer only. TL is enabled adaptively (turned on when loss rises, off when the network recovers) because TL reduces compression efficiency when the network is loss-free (Source: "Temporal Layer" section).
- Long-Term Reference (LTR) frames with explicit RTP header extension indicators and frame_id ACK feedback. Two recovery paths: reactive (RPSI from the receiver on freeze) and proactive (the sender emits periodic LTRPs when elevated loss is detected). LTR frames are combined with periodic higher-quality frames to mitigate temporal-correlation decay. The encoder maintains a bounded reference buffer of size 4 (Source: "Long-Term Reference" section).
- Future work: group calls. Decoding multiple AV1 streams simultaneously is harder than 1:1; hardware AV1 support across all device tiers is needed for quality improvement (Source: "Meta's Ongoing Journey With AV1" section).
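The three-layer complexity adaptation described in the takeaways can be sketched as a single decision function. This is a minimal illustration, not Meta's implementation: `CallState`, `adapt`, the latency budget, the preset numbering, and the battery cutoff are all hypothetical, and the source does not say how battery level enters the decision.

```python
from dataclasses import dataclass, replace

# Every constant here is an illustrative assumption, not a value from the article.
FRAME_BUDGET_MS = 33.0   # assumed per-frame time budget (about 30 fps)
FASTEST_PRESET = 10      # assumed: a higher preset number means lower complexity
LOW_BATTERY_PCT = 15     # assumed cutoff; the source only says battery is considered


@dataclass(frozen=True)
class CallState:
    codec: str                     # "AV1" or "H264"
    preset: int                    # current AV1 encoder preset
    encode_latency_ms: float       # smoothed local encoding latency
    peer_decode_latency_ms: float  # decoding latency reported by the peer
    battery_pct: int


def adapt(state: CallState) -> CallState:
    """Return the send configuration for the next adaptation tick."""
    if state.codec != "AV1":
        return state  # this sketch does not model switching back up to AV1
    # (c) Peer feedback says AV1 decoding is too slow, or battery is low.
    if (state.peer_decode_latency_ms > FRAME_BUDGET_MS
            or state.battery_pct < LOW_BATTERY_PCT):
        return replace(state, codec="H264")
    if state.encode_latency_ms > FRAME_BUDGET_MS:
        # (a) First try a faster, lower-complexity preset.
        if state.preset < FASTEST_PRESET:
            return replace(state, preset=state.preset + 1)
        # (b) Already at the fastest preset: fall back to H.264.
        return replace(state, codec="H264")
    return state
```

Calling `adapt` once per monitoring interval steps the preset one level at a time, so the H.264 fallback is reached only after cheaper AV1 presets have been tried.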
Operational Numbers
| Metric | Value |
|---|---|
| AV1 vs H.264 bitrate reduction | ≥20% (offline tests, low/mid-range devices) |
| Open-source AV1 encoder power increase (Pixel 8) | 14% vs H.264 |
| Target VBV delay for RTC | <200 ms |
| RTC video bitrate range (emerging markets) | 10–400 kbps |
| Challenging quality threshold | <100 kbps |
| AV1 binary size addition (libAOM example) | 1.7 MB uncompressed / 600 kB compressed |
| QM tool share of encoder library size | ~10% |
| LTR reference buffer size | 4 |
| Acceptable end-to-end video latency | <300 ms |
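To make the <200 ms VBV target concrete, the following sketch models the VBV as a simple leaky bucket: each encoded frame adds its bits, the bucket drains at the target bitrate, and the delay is the time needed to drain what is left. The function names and the sample frame sizes are assumptions for illustration; the article does not publish its exact buffer model.

```python
def vbv_delays_ms(frame_bits, target_bps, fps):
    """Return the VBV delay in ms right after each frame enters the bucket."""
    drain_per_frame = target_bps / fps  # bits drained during one frame interval
    fullness = 0.0
    delays = []
    for bits in frame_bits:
        # Drain for one interval (never below empty), then add the new frame.
        fullness = max(0.0, fullness - drain_per_frame) + bits
        delays.append(1000.0 * fullness / target_bps)
    return delays


def exceeds_target(delays, limit_ms=200.0):
    """True if any frame pushes the VBV delay to or past the limit."""
    return any(d >= limit_ms for d in delays)


# Hypothetical 100 kbps stream at 10 fps: the per-frame budget is 10,000 bits.
steady = vbv_delays_ms([10_000, 10_000, 10_000], 100_000, 10)
# An oversized 30,000-bit keyframe followed by small frames.
spiky = vbv_delays_ms([30_000, 5_000, 5_000], 100_000, 10)
```

In this model the steady stream sits at 100 ms, while the oversized keyframe pushes the delay to 300 ms and the following undersized frames only bring it down by 50 ms each. That is the behaviour the rate controller avoids by capping keyframe size and compensating in later frames.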
Systems & Concepts Extracted
Systems
- systems/av1-codec — the codec being deployed
- systems/dav1d — open-source AV1 decoder selected for Meta RTC
- systems/messenger — Meta RTC application surface
- systems/whatsapp — Meta RTC application surface
- systems/meta-rtc-platform — Meta's real-time communication infrastructure (new)
Concepts
- concepts/rtc-codec-rate-control — VBV-based rate control for real-time video (new)
- concepts/vbv-delay — leaky-bucket metric for CBR accuracy (new)
- concepts/codec-complexity-adaptation — runtime adjustment of encoder preset/codec based on device capability (new)
- concepts/device-eligibility-ml — ML-based device capability classification (new)
- concepts/asymmetric-codec-design — different send/receive codecs exploiting encode/decode asymmetry (new)
- concepts/temporal-layer-error-resilience — temporal scalability for loss tolerance (new)
- concepts/long-term-reference-frame — pinned reference frames for fast loss recovery (new)
- concepts/reference-picture-resampling — resolution change without keyframe (new)
- concepts/binary-size-bloat — existing page, extended with AV1 mobile datum
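The temporal-layer concept can be illustrated with a common two-layer dependency pattern, which is an assumption here since the source does not spell out the exact frame structure: even frames form the base layer (TL0) and reference the previous TL0 frame, while odd frames form the enhancement layer (TL1) and reference the frame just before them. The helper names are hypothetical.

```python
def temporal_layer(frame_index):
    """0 for base-layer (TL0) frames, 1 for enhancement-layer (TL1) frames."""
    return frame_index % 2


def reference_of(frame_index):
    """Index of the frame this one predicts from, or None for the keyframe."""
    if frame_index == 0:
        return None                 # keyframe, no reference
    if frame_index % 2 == 0:
        return frame_index - 2      # TL0 references the previous TL0 frame
    return frame_index - 1          # TL1 references the TL0 frame before it


def decodable(received, n_frames):
    """Frames that arrived and whose whole reference chain also arrived."""
    ok = set()
    for i in range(n_frames):
        ref = reference_of(i)
        if i in received and (ref is None or ref in ok):
            ok.add(i)
    return sorted(ok)


# Every TL1 frame lost: the base layer still decodes at half the frame rate.
only_base = decodable({0, 2, 4, 6}, 8)
# One TL0 frame (index 2) lost: everything after it is undecodable.
base_loss = decodable({0, 1, 3, 4, 5, 6, 7}, 8)
```

Under this pattern, losing enhancement frames costs only frame rate, whereas losing one base-layer frame breaks the chain, which is consistent with protecting only the base layer with FEC.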
Patterns
- patterns/ml-based-device-eligibility — use production telemetry + ML to classify device codec capability (new)
- patterns/adaptive-encoder-preset — monitor encoding latency → adjust complexity at runtime (new)
- patterns/latency-aware-codec-switching — switch codec based on encoding/decoding latency feedback (new)
- patterns/asymmetric-send-receive-codec — send lower-complexity codec, receive higher-complexity codec on weaker devices (new)
- patterns/adaptive-temporal-layer-activation — enable/disable TL based on network loss signal (new)
- patterns/ltr-proactive-and-reactive-recovery — dual-path LTR recovery: RPSI on freeze + periodic LTRP on elevated loss (new)
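The sender-side bookkeeping behind the LTR recovery pattern can be sketched as below, assuming a bounded buffer of four LTR frames and frame_id ACKs as described in the source. The class and method names are hypothetical, and falling back to a keyframe when no LTR has been acknowledged is an assumption, not something the article states.

```python
from collections import OrderedDict


class LtrSender:
    """Tracks which long-term reference frames the receiver has confirmed."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.ltr = OrderedDict()  # frame_id -> acked flag, oldest first

    def mark_ltr(self, frame_id):
        """Record a newly encoded LTR frame, evicting the oldest if full."""
        self.ltr[frame_id] = False
        while len(self.ltr) > self.capacity:
            self.ltr.popitem(last=False)

    def on_ack(self, frame_id):
        """Handle a frame_id ACK; ACKs for evicted frames are ignored."""
        if frame_id in self.ltr:
            self.ltr[frame_id] = True

    def recovery_reference(self):
        """Newest acknowledged LTR to predict from, or None if there is none.

        None means the caller needs another recovery method (assumed here
        to be a keyframe).
        """
        for frame_id in reversed(self.ltr):
            if self.ltr[frame_id]:
                return frame_id
        return None


sender = LtrSender()
for fid in (10, 20, 30, 40, 50):   # the fifth LTR evicts frame 10
    sender.mark_ltr(fid)
sender.on_ack(10)                  # too late, frame 10 is already evicted
sender.on_ack(30)
sender.on_ack(20)
```

Both recovery paths would use the same lookup: on a reactive RPSI or a proactive periodic tick, the sender encodes the next frame against `recovery_reference()` instead of the most recent frame, which the receiver may not have.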
Caveats
- No specific numbers on AV1 fleet coverage percentage, call-quality improvement metrics, or A/B test results disclosed.
- ML model architecture (V1.1 / V2) not detailed beyond "uses low-level performance metrics."
- Exact encoder identity not disclosed (referred to as "internal low-complexity encoder").
- No timeline for group call AV1 deployment.
- Power consumption parity claim is for the internal encoder only — not a general AV1 property.
Source
- Original: https://engineering.fb.com/2026/06/22/video-engineering/adopting-av1-for-real-time-communication-rtc-meta/
- Raw markdown: `raw/meta/2026-06-22-adopting-av1-for-real-time-communication-rtc-at-scale-6122e204.md`
Related
- sources/2024-06-13-meta-mlow-metas-low-bitrate-audio-codec — Meta's RTC audio codec (MLow); sibling on the RTC media axis
- sources/2026-04-09-meta-escaping-the-fork-webrtc-modernization — Meta's WebRTC modernization; shared RTC infrastructure substrate
- sources/2026-03-09-meta-ffmpeg-at-meta-media-processing-at-scale — Meta's video encoding infrastructure (VOD axis)
- systems/av1-codec — the codec's wiki page
- systems/dav1d — the decoder selected
- companies/meta