PATTERN Cited by 1 source
Measure idle from last-ACK, not last-send
Pattern
When detecting idleness in a bidirectional rate-adaptive
protocol, anchor the idle duration on the most recent activity
across both directions — not on the most recent activity in one
direction only. Concretely: use
max(last_ack_time, last_sent_time) as the idle-start anchor,
not last_sent_time alone.
Canonical instance: Cloudflare's 2026-05-12 three-line fix to quiche's CUBIC congestion controller (Source: sources/2026-05-12-cloudflare-when-idle-isnt-idle-how-a-linux-kernel-optimization-became-a-quic-bug).
The structural problem the pattern solves
A rate-adaptive protocol that wants to detect "the peer / application has been idle for T seconds" needs a measurement anchor — a timestamp against which to compute idle duration. The naive choice is "last time I sent a packet", because send-time is cheap to record and most natural for a sender-side state machine.
At large windows this works: sends are frequent, the sender
is always busy, and the gap between consecutive sends is a fine
proxy for idleness. At minimum-window operation
(CUBIC minimum cwnd = 2
packets, or equivalently small-window TCP / slow-link
scenarios), the cheap proxy breaks:
- cwnd = 2 × MSS → after each RTT, both packets are ACKed, bytes_in_flight drops to zero, and then the sender transmits the next burst.
- last_sent_time is the timestamp of the start of the previous RTT cycle.
- The real gap between "peer was last active" and "next send" is the ACK-processing delay + scheduler-dispatch delay, typically microseconds.
- But now − last_sent_time is ~RTT — tens of milliseconds or more.
So the naive anchor claims "~RTT of idleness" in a scenario where the connection was actively congestion-limited. This is false idle detection.
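The mismatch can be made concrete with a small timeline. The sketch below is illustrative only: `naive_idle` and `min_cwnd_round` are hypothetical helpers, not quiche functions, and the RTT and ACK-processing delay are whatever values the caller assumes.

```rust
use std::time::{Duration, Instant};

/// Idle duration as seen by the naive single-anchor scheme:
/// time elapsed since the last packet was sent.
fn naive_idle(now: Instant, last_sent_time: Instant) -> Duration {
    now.saturating_duration_since(last_sent_time)
}

/// Hypothetical timeline for one minimum-cwnd round trip.
/// Returns (apparent idle per the naive anchor, real gap since the ACK).
fn min_cwnd_round(
    start: Instant,
    rtt: Duration,
    ack_processing: Duration,
) -> (Duration, Duration) {
    // The two-packet burst leaves at `start`.
    let last_sent_time = start;
    // Both packets are ACKed one RTT later; bytes in flight drop to zero.
    let last_ack_time = start + rtt;
    // The next burst goes out right after ACK processing.
    let now = last_ack_time + ack_processing;

    let apparent = naive_idle(now, last_sent_time);
    let real = now.saturating_duration_since(last_ack_time);
    (apparent, real)
}
```

With an assumed 40 ms RTT and 50 µs of ACK processing, the naive anchor reports roughly 40 ms of idleness while the real gap is 50 µs.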
The fix
Add a secondary anchor: last_ack_time, updated on every
incoming ACK. Compute the idle-start as the later of the two
timestamps:
// cubic.rs — on_packet_sent() (2026-05-12 fix)
if bytes_in_flight == 0 {
    if let Some(recovery_start_time) = r.congestion_recovery_start_time {
        let idle_start = cmp::max(cubic.last_ack_time, cubic.last_sent_time);
        if let Some(idle_start) = idle_start {
            if idle_start < now {
                let delta = now - idle_start;
                r.congestion_recovery_start_time =
                    Some(recovery_start_time + delta);
            }
        }
    }
}
Behaviour by regime:
- Minimum-cwnd, ACK just landed. last_ack_time ≈ now → idle_start ≈ now → delta ≈ 0 → no spurious epoch advance → no death spiral.
- Genuine application idleness, no recent ACKs. last_ack_time is far in the past (older than last_sent_time) → idle_start = last_sent_time (the original 2020 behaviour) → delta = now − last_sent_time captures the true idle duration → CUBIC epoch correctly shifts forward → growth curve shape preserved.
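The two regimes can be checked in isolation with a standalone version of the delta computation. This is a minimal sketch, assuming both timestamps are stored as `Option<Instant>` as the excerpt above suggests; `idle_delta` is a hypothetical free function, not part of quiche.

```rust
use std::cmp;
use std::time::{Duration, Instant};

/// How far the recovery start would be shifted on a send that finds
/// zero bytes in flight (hypothetical helper mirroring the logic above).
fn idle_delta(
    now: Instant,
    last_ack_time: Option<Instant>,
    last_sent_time: Option<Instant>,
) -> Duration {
    // Option<Instant> orders None below every Some, so max() yields the
    // later of the known timestamps, or None if neither is set.
    match cmp::max(last_ack_time, last_sent_time) {
        // Only shift when the anchor is strictly in the past.
        Some(idle_start) if idle_start < now => now - idle_start,
        _ => Duration::ZERO,
    }
}
```

When an ACK has just landed, the returned delta is the ACK-processing gap. When the ACK timestamp is older than the send timestamp, or missing, the function falls back to `last_sent_time`.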
Generalisation
The pattern generalises to any rate-adaptive protocol that wants to distinguish "peer activity paused" from "local transmission paused":
- CCAs with idle-period adjustments — CUBIC, BBRv2 / BBRv3 probe-RTT timers, TCP RACK reordering-window heuristics.
- Application-level rate limiters / token buckets. If a bucket refills based on "seconds since last withdrawal" but the consumer is rate-limited downstream, the bucket may incorrectly refill during normal operation.
- Keepalive / idle-timeout logic. Connections that close on "no sent traffic for N seconds" may close prematurely on small-window downloads where the application has data waiting but the network hasn't granted window.
- Adaptive batching in message-oriented systems. Batching algorithms that coalesce based on "time since last message" will mis-batch under small-window backpressure.
In every case, the corrective move is the same: don't anchor on one direction's activity alone — anchor on the most recent activity across both directions.
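As an illustration of the keepalive / idle-timeout case, the sketch below tracks both directions and measures idleness from whichever was active most recently. `IdleTracker` and its methods are hypothetical names introduced for this example, not an API from any library.

```rust
use std::time::{Duration, Instant};

/// Hypothetical two-direction idle tracker for a connection.
struct IdleTracker {
    last_sent: Option<Instant>,
    last_received: Option<Instant>,
}

impl IdleTracker {
    fn new() -> Self {
        IdleTracker { last_sent: None, last_received: None }
    }

    fn on_send(&mut self, now: Instant) {
        self.last_sent = Some(now);
    }

    fn on_receive(&mut self, now: Instant) {
        self.last_received = Some(now);
    }

    /// Idle time measured from the most recent activity in either
    /// direction. Returns None until some activity has been recorded.
    fn idle_for(&self, now: Instant) -> Option<Duration> {
        let anchor = self.last_sent.max(self.last_received)?;
        Some(now.saturating_duration_since(anchor))
    }

    /// True once the two-direction idle time reaches `timeout`.
    fn is_idle(&self, now: Instant, timeout: Duration) -> bool {
        self.idle_for(now).map_or(false, |idle| idle >= timeout)
    }
}
```

A connection that last sent 10 seconds ago but received data 1 second ago is reported as 1 second idle, so a 5-second timeout does not fire.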
Boundary-condition guard
The pattern also requires a second discipline: ensure the
epoch / recovery boundary stays on the correct side of wall-clock
time. The 2026-05-12 fix includes if idle_start < now
as a guard. Without it, future-timestamp values (from clock
skew, manual adjustments, or bugs further up the stack) would
produce negative deltas and corrupt the epoch. Linux TCP's
2017 follow-up commit titled "tcp_cubic: do not set
epoch_start in the future" is the kernel-side expression of
the same guard.
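One way to express the same guard is with the standard library's `Instant::checked_duration_since`, which returns `None` when the anchor lies after `now`. This is a sketch of an equivalent formulation, not the code used in quiche; `shift_recovery_start` is a hypothetical helper.

```rust
use std::time::Instant;

/// Shift `recovery_start` forward by the idle time, but only when
/// `idle_start` is strictly in the past (hypothetical helper).
fn shift_recovery_start(
    recovery_start: Instant,
    idle_start: Instant,
    now: Instant,
) -> Instant {
    match now.checked_duration_since(idle_start) {
        // Anchor is in the past: move the boundary by the idle time.
        Some(delta) if !delta.is_zero() => recovery_start + delta,
        // Anchor equals `now` or lies in the future: leave it untouched.
        _ => recovery_start,
    }
}
```

A future-dated `idle_start` leaves the boundary unchanged instead of feeding a bogus delta into the epoch arithmetic.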
Seen in
- sources/2026-05-12-cloudflare-when-idle-isnt-idle-how-a-linux-kernel-optimization-became-a-quic-bug — canonical wiki instance. Cloudflare's three-line fix to quiche adds last_ack_time as a state variable and uses max(last_ack_time, last_sent_time) as the idle-delta anchor inside on_packet_sent(). Restores 100% test pass rate on the CUBIC-minimum-cwnd corner-case test that had been failing ~60% of runs.
Related
- concepts/bytes-in-flight — the ambiguous signal the pattern disambiguates by adding a second anchor.
- concepts/false-idle-detection — the failure mode this pattern fixes.
- concepts/minimum-cwnd-death-spiral — the specific failure shape at CUBIC minimum cwnd.
- concepts/cubic-epoch — the state variable whose arithmetic gets corrupted by the single-anchor mistake.
- concepts/ack-clock — the forcing function that makes the single-anchor bug fire once per RTT.
- systems/cubic-congestion-control
- systems/quiche
- patterns/userspace-port-of-kernel-primitive-risk
- companies/cloudflare