
SYSTEM Cited by 1 source

quiche

quiche (github.com/cloudflare/quiche) is Cloudflare's open-source implementation of QUIC and HTTP/3, written in Rust. It is production-deployed across Cloudflare's edge as the QUIC / HTTP/3 library in the critical path "for a significant share of the traffic we serve" (Source: sources/2026-05-12-cloudflare-when-idle-isnt-idle-how-a-linux-kernel-optimization-became-a-quic-bug).

What it is

  • Rust library, not a service. Embedders include the Cloudflare edge proxy stack (sibling to Pingora), standalone servers, and Cloudflare's HTTP/3 client tooling. The library implements the IETF QUIC + HTTP/3 wire protocols above UDP.
  • User-space congestion control. Because QUIC runs in user space (unlike kernel TCP), the congestion controller is part of the library, not the OS. quiche's default congestion control algorithm (CCA) is CUBIC; it also ships Reno for comparison and a BBRv3 implementation "now enabled for a growing percentage of our QUIC deployments." Canonical wiki instance of concepts/user-space-congestion-control combined with a modular, pluggable CCA (swap CUBIC for BBRv3 or Reno without kernel changes).
  • qlog as the built-in observability substrate. quiche emits qlog events — standardised JSON records covering packet sends/receives, loss events, CCA state transitions, and cwnd/bytes-in-flight time series. These are the substrate on which the 2026-05-12 post's death-spiral diagnosis ran: instrumenting quiche's qlog output with packet-loss events and building visualisations was "how we zoomed into that region" of the test failure.
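
The pluggable-CCA point above can be sketched in a few lines of Rust. This is an illustration only: the `CongestionControl` trait, the `Reno` and `FixedWindow` structs, the `Connection` type and the `MSS` value are all hypothetical names chosen for this sketch and are not quiche's internal API.

```rust
// Hypothetical interface for a swappable congestion controller.
// quiche's real internal types and callbacks differ.
trait CongestionControl {
    fn name(&self) -> &'static str;
    fn cwnd(&self) -> usize;
    fn on_ack(&mut self, acked_bytes: usize);
    fn on_loss(&mut self);
}

const MSS: usize = 1200; // assumed datagram size for this sketch
const MIN_CWND: usize = 2 * MSS;

struct Reno {
    cwnd: usize,
    ssthresh: usize,
}

impl Reno {
    fn new() -> Self {
        Reno { cwnd: 10 * MSS, ssthresh: usize::MAX }
    }
}

impl CongestionControl for Reno {
    fn name(&self) -> &'static str { "reno" }
    fn cwnd(&self) -> usize { self.cwnd }

    fn on_ack(&mut self, acked_bytes: usize) {
        if self.cwnd < self.ssthresh {
            // Slow start: grow by the number of bytes acknowledged.
            self.cwnd += acked_bytes;
        } else {
            // Congestion avoidance: roughly one MSS per round trip.
            self.cwnd += MSS * acked_bytes / self.cwnd;
        }
    }

    fn on_loss(&mut self) {
        // Halve the window, but never go below the minimum.
        self.cwnd = (self.cwnd / 2).max(MIN_CWND);
        self.ssthresh = self.cwnd;
    }
}

// Toy controller that never changes its window; present only to show the swap.
struct FixedWindow {
    cwnd: usize,
}

impl CongestionControl for FixedWindow {
    fn name(&self) -> &'static str { "fixed" }
    fn cwnd(&self) -> usize { self.cwnd }
    fn on_ack(&mut self, _acked_bytes: usize) {}
    fn on_loss(&mut self) {}
}

// The connection owns whichever controller was selected at construction time.
struct Connection {
    cc: Box<dyn CongestionControl>,
}

impl Connection {
    fn new(algorithm: &str) -> Option<Self> {
        let cc: Box<dyn CongestionControl> = match algorithm {
            "reno" => Box::new(Reno::new()),
            "fixed" => Box::new(FixedWindow { cwnd: 10 * MSS }),
            _ => return None,
        };
        Some(Connection { cc })
    }
}
```

Because the controller lives behind a trait object inside the library, choosing a different algorithm is a constructor argument for the embedder rather than a kernel change.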

Architectural shape

quiche sits in a position analogous to kernel TCP but deliberately different:

  • Kernel TCP = congestion control + retransmission + flow control + ACK clocking inside the OS, called by user-space sockets.
  • quiche = all of the above, packaged as a library + shipped with the application, called by the embedder's I/O loop.

The consequence the 2026-05-12 post makes explicit: kernel TCP CUBIC applies the idle-period adjustment from the kernel's congestion-control event hook (CA_EVENT_TX_START, handled in bictcp_cwnd_event()), whereas quiche's port of the same 2017-era adjustment had to implement the logic inside on_packet_sent(). That structural difference is exactly where the death-spiral bug hid from the 2020 port until its 2026 discovery. Canonical patterns/userspace-port-of-kernel-primitive-risk instance.

2020 CUBIC port — the structural fault line

Cloudflare's original CUBIC-for-quiche post (2020) describes porting the 2017 Linux-kernel CUBIC-after-idle optimisation ("shift epoch_start forward by the idle duration rather than resetting it, so the CUBIC growth curve picks up where it left off") into on_packet_sent():

// cubic.rs — on_packet_sent() (2020 port, simplified)
fn on_packet_sent(&mut self, bytes_in_flight: usize, now: Instant, ...) {
    if bytes_in_flight == 0 {
        let delta = now - self.last_sent_time;
        self.congestion_recovery_start_time += delta;
    }
    self.last_sent_time = now;
}
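
Why shifting the epoch matters can be seen from the CUBIC window function itself. The sketch below uses the standard curve W(t) = C·(t − K)³ + W_max with K = cbrt(W_max·(1 − β)/C) and the RFC 9438 constants C = 0.4 and β = 0.7. The function names, the use of `Duration` as a timestamp from an arbitrary origin, and the window measured in packets are assumptions made for illustration; none of this is quiche's code.

```rust
use std::time::Duration;

const C: f64 = 0.4; // CUBIC scaling constant (RFC 9438)
const BETA: f64 = 0.7; // multiplicative decrease factor (RFC 9438)

/// CUBIC window, in packets, `t` seconds after the epoch started.
fn w_cubic(t: f64, w_max: f64) -> f64 {
    // K is the time the curve needs to climb back to w_max.
    let k = (w_max * (1.0 - BETA) / C).cbrt();
    C * (t - k).powi(3) + w_max
}

/// Seconds elapsed on the curve. Timestamps are offsets from an arbitrary
/// origin so that the example stays deterministic.
fn curve_time(now: Duration, epoch_start: Duration) -> f64 {
    now.saturating_sub(epoch_start).as_secs_f64()
}

/// Window after an idle period, with and without the epoch shift.
/// Returns (unshifted, shifted).
fn window_after_idle(
    epoch_start: Duration,
    idle: Duration,
    now: Duration,
    w_max: f64,
) -> (f64, f64) {
    let unshifted = w_cubic(curve_time(now, epoch_start), w_max);
    let shifted = w_cubic(curve_time(now, epoch_start + idle), w_max);
    (unshifted, shifted)
}
```

With W_max = 100 packets, an epoch starting at 0 s, two seconds of activity and then ten seconds of idleness, the unshifted curve is evaluated at t = 12 s and returns a window far above W_max, while the shifted curve is evaluated at t = 2 s and resumes exactly where it stopped.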

What the 2020 port missed: Linux's own 1-week follow-up commit (c2e7204d180f) fixed a bug in the original kernel change — "do not set epoch_start in the future" — noting that "tracking idle time in bictcp_cwnd_event() is imprecise, as epoch_start is normally set at ACK processing time, not at send time". quiche inherited the pre-fix logic.
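
The guard that the follow-up commit describes ("do not set epoch_start in the future") amounts to a clamp. The following is a minimal Rust sketch of that idea, not the kernel's C code; `shift_epoch` and its parameters are hypothetical names.

```rust
use std::time::{Duration, Instant};

/// Shift `epoch_start` forward by `idle`, but never past `now`.
///
/// The epoch is normally set at ACK-processing time, which is later than the
/// last send. Adding an idle duration measured from the last send can
/// therefore overshoot `now`; the clamp caps the result at `now`.
fn shift_epoch(epoch_start: Instant, idle: Duration, now: Instant) -> Instant {
    (epoch_start + idle).min(now)
}
```

Without the clamp, an epoch placed in the future makes the elapsed time on the CUBIC curve meaningless until the clock catches up with it.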

2026-05-12 fix

Three lines of added state and logic:

// cubic.rs — on_packet_sent() (2026 fix)
fn on_packet_sent(&mut self, bytes_in_flight: usize, now: Instant, ...) {
    if bytes_in_flight == 0 {
        if let Some(recovery_start_time) = self.congestion_recovery_start_time {
            // Measure idle from the most recent activity: either the
            // last ACK (approximating when bif hit 0) or the last
            // send, whichever is later.
            let idle_start = cmp::max(self.last_ack_time, self.last_sent_time);
            if let Some(idle_start) = idle_start {
                // Shift only if time has actually passed since that activity.
                if idle_start < now {
                    let delta = now - idle_start;
                    self.congestion_recovery_start_time =
                        Some(recovery_start_time + delta);
                }
            }
        }
    }
}

The fix adds a new state variable, last_ack_time, updated on every ACK, and anchors the idle delta at max(last_ack_time, last_sent_time). When bytes_in_flight dips transiently to zero between an ACK and the next send (the death-spiral trigger at minimum cwnd), last_ack_time > last_sent_time, so the delta captures the true ~0-ms processing gap rather than the full ~RTT gap. For genuine application idleness, last_ack_time is far in the past and the original epoch-shift behaviour is preserved.
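
The difference between the two anchors is easiest to see with concrete timestamps. The sketch below is an illustration under assumed values (a 100 ms RTT and a 50 µs gap between ACK processing and the next send); the function names are hypothetical and the code is not taken from quiche.

```rust
use std::cmp;
use std::time::{Duration, Instant};

/// Idle delta as the 2020 port measured it: time since the last send.
fn idle_since_last_send(last_sent: Instant, now: Instant) -> Duration {
    now.saturating_duration_since(last_sent)
}

/// Idle delta with the 2026 anchor: time since the later of the last ACK
/// and the last send. Returns None if neither has been recorded yet.
fn idle_since_last_activity(
    last_ack: Option<Instant>,
    last_sent: Option<Instant>,
    now: Instant,
) -> Option<Duration> {
    // Option<Instant> orders None below any Some, so max picks the later
    // timestamp when at least one of the two is set.
    let idle_start = cmp::max(last_ack, last_sent)?;
    Some(now.saturating_duration_since(idle_start))
}
```

For a packet sent at t0, acknowledged one RTT later and followed by a new send 50 µs after the ACK, the send-based delta is the RTT plus 50 µs, whereas the activity-based delta is only the 50 µs gap. After 30 s of real idleness both measurements are dominated by the idle time, so the epoch shift still applies.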

See patterns/measure-idle-from-last-ack-not-last-send for the general pattern.
