PATTERN Cited by 1 source
Userspace port of kernel primitive — inherited-bug risk¶
Pattern¶
When porting a kernel-space primitive to user space, the callback / event boundaries available in the kernel are rarely available one-for-one in user space — and follow-up bug fixes to the kernel primitive may not be visible to the userspace implementer years later when the userspace bug surfaces.
Two distinct risks compose:
- Callback-shape mismatch. The kernel version hooks into a specific OS event (e.g. CA_EVENT_TX_START in Linux TCP for the "about to transmit" moment). The user-space version has to approximate that hook from a different call site (e.g. inside a send-path function), and the approximation may be structurally weaker.
- Follow-up-fix visibility gap. If the kernel primitive ships a follow-up bug fix within days or weeks, the userspace port that happened years later may inherit only the original fix. The reviewer of the port sees the "canonical" kernel commit and reads it as the definitive version; the week-later correction is in a different commit with different authors and may not be linked from the primary.
Canonical instance: the 2020 port of Linux TCP CUBIC's "after
idle" optimisation into quiche's
on_packet_sent() inherited the 2017 Linux fix but not the
1-week-later follow-up that said "do not set epoch_start in
the future" (Source:
sources/2026-05-12-cloudflare-when-idle-isnt-idle-how-a-linux-kernel-optimization-became-a-quic-bug).
The canonical instance in detail¶
2017. Jana Iyengar reports: Linux TCP CUBIC inflates cwnd
dangerously after an application-idle period because delta_t
= now − epoch_start grows during idleness.
Neal Cardwell corrects an initial "reset epoch" proposal:
resetting would restart the growth curve from cwnd's current
value, behaving like a loss. The accepted fix
(30927520dbae)
shifts epoch_start forward by the idle duration, preserving
the growth-curve shape.
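To make the inflation concrete, here is a minimal sketch of the CUBIC window curve in floating point. The constants follow the values commonly given in the CUBIC RFCs; `w_cubic` and `shifted_elapsed` are hypothetical helper names for illustration and are not taken from Linux or quiche.

```rust
// Illustrative constants in the style of the CUBIC RFCs (assumption:
// neither Linux nor quiche necessarily uses these exact values or f64 math).
const C: f64 = 0.4; // cubic scaling constant
const BETA: f64 = 0.7; // multiplicative decrease factor

/// Window target W(t) = C * (t - K)^3 + w_max, where t is the number of
/// seconds elapsed since epoch_start and w_max is the pre-loss window.
fn w_cubic(t_secs: f64, w_max: f64) -> f64 {
    // K is the time the curve needs to climb back to w_max.
    let k = (w_max * (1.0 - BETA) / C).cbrt();
    C * (t_secs - k).powi(3) + w_max
}

/// Elapsed curve time after epoch_start has been shifted forward by the
/// idle duration (hypothetical helper; floors at zero).
fn shifted_elapsed(elapsed_secs: f64, idle_secs: f64) -> f64 {
    (elapsed_secs - idle_secs).max(0.0)
}
```

With `w_max = 100.0`, two seconds of active sending followed by thirty seconds of idleness evaluates the curve at t = 32 and yields a target far above `w_max`, whereas evaluating at `shifted_elapsed(32.0, 30.0)` returns the same point on the curve as before the idle period.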
~1 week later. A follow-up commit
(c2e7204d180f)
fixes a bug in the first fix: "Tracking idle time in
bictcp_cwnd_event() is imprecise, as epoch_start is normally
set at ACK processing time, not at send time. Doing a proper
fix would need to add an additional state variable, and does
not seem worth the trouble, given CUBIC bug has been there
forever before Jana noticed it. Let's simply not set
epoch_start in the future, otherwise bictcp_update() could
overflow and CUBIC would again grow cwnd too fast."
The fix ships as a guard: if the arithmetic would set
epoch_start > now, clamp it.
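The shift-then-clamp logic can be sketched in Rust as follows. This is an illustration of the guard described above, not the kernel code itself; the function name and the use of `std::time::Instant` are assumptions made for the example.

```rust
use std::time::Instant;

/// Shift `epoch_start` forward by the gap since the last send, then clamp
/// the result so it never lands after `now`.
/// Hypothetical helper: the name and signature are for illustration only.
fn shift_epoch_after_idle(epoch_start: Instant, last_send: Instant, now: Instant) -> Instant {
    // Idle duration as seen from the send path.
    let idle = now.saturating_duration_since(last_send);
    // The original fix: move the epoch forward by the idle time.
    let shifted = epoch_start + idle;
    // The follow-up guard: do not set the epoch in the future.
    shifted.min(now)
}
```

The clamp matters when `epoch_start` was set at ACK-processing time, after `last_send`: the send-side gap then overstates the idle time relative to the epoch, and the unclamped sum would overshoot `now`.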
2020. Cloudflare ports the 2017 CUBIC-after-idle
optimisation into quiche. Because QUIC runs in user space,
there is no CA_EVENT_TX_START callback. The port instead
puts the idle-detection logic inside on_packet_sent(), using
bytes_in_flight == 0 as the idle predicate and
now - last_sent_time as the delta:
```rust
// cubic.rs — on_packet_sent() (2020 port, buggy)
fn on_packet_sent(&mut self, bytes_in_flight: usize, now: Instant, ...) {
    if bytes_in_flight == 0 {
        let delta = now - self.last_sent_time;
        self.congestion_recovery_start_time += delta;
    }
    self.last_sent_time = now;
}
```
This port never received the 1-week-later kernel guard.
Worse, the callback-shape mismatch (no ACK-processing hook)
changes the semantics of bytes_in_flight == 0 from
"application was idle between ACK and send" to "transient
drain between ACK and next send at minimum cwnd". Those are
not the same condition.
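A toy model shows how the unclamped shift can push the timestamp past the current time. The struct below is a hypothetical stand-in that reuses the field names from the snippet above; it is not quiche's actual type, and the timeline used to exercise it is invented for illustration.

```rust
use std::time::Instant;

/// Toy stand-in for the 2020 logic (assumption: only the two fields that
/// matter for the shift are modelled).
struct ToyCubic {
    last_sent_time: Instant,
    congestion_recovery_start_time: Instant,
}

impl ToyCubic {
    fn on_packet_sent(&mut self, bytes_in_flight: usize, now: Instant) {
        if bytes_in_flight == 0 {
            // Unclamped shift: the whole gap since the last send is
            // treated as idle time.
            let delta = now.saturating_duration_since(self.last_sent_time);
            self.congestion_recovery_start_time += delta;
        }
        self.last_sent_time = now;
    }
}
```

Suppose a packet is sent at t0, recovery starts while processing an ACK at t0 + 50 ms, and the next packet is sent at t0 + 60 ms with nothing in flight. The computed delta is 60 ms, so the recovery start time moves to t0 + 110 ms, which is 50 ms later than the moment of the send.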
2026-05-12. Cloudflare's CI integration test
(patterns/adversarial-corner-case-test-for-recovery)
surfaces the
CUBIC minimum-cwnd death
spiral. The fix adds a last_ack_time state variable to
approximate the kernel's ACK-processing anchor — essentially
the "additional state variable" the 2017 follow-up commit
mentioned but declined to add.
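One possible shape for an ACK-anchored idle measurement is sketched below. The source states only that a last_ack_time state variable was added; taking the later of the last send and the last ACK as the start of the idle period, and clamping the result to the current time, are assumptions of this sketch rather than a description of the actual quiche change.

```rust
use std::time::Instant;

/// Hypothetical controller state; only last_ack_time is named by the source.
struct AckAnchoredCubic {
    last_sent_time: Instant,
    last_ack_time: Option<Instant>,
    congestion_recovery_start_time: Instant,
}

impl AckAnchoredCubic {
    fn on_ack_received(&mut self, now: Instant) {
        self.last_ack_time = Some(now);
    }

    fn on_packet_sent(&mut self, bytes_in_flight: usize, now: Instant) {
        if bytes_in_flight == 0 {
            // Assumption: idleness begins at the later of the last send
            // and the last ACK, not at the last send alone.
            let idle_start = match self.last_ack_time {
                Some(ack) => ack.max(self.last_sent_time),
                None => self.last_sent_time,
            };
            let delta = now.saturating_duration_since(idle_start);
            // Assumption: keep the kernel-style guard as a second line of
            // defence so the timestamp never lands in the future.
            let shifted = self.congestion_recovery_start_time + delta;
            self.congestion_recovery_start_time = shifted.min(now);
        }
        self.last_sent_time = now;
    }
}
```

With this anchor, a send that follows an ACK by 10 ms shifts the timestamp by 10 ms rather than by the full gap since the previous send, while a genuinely long application-idle period still shifts it by roughly the whole idle duration.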
Structural causes of the gap¶
- Different call-site primitives. Kernel TCP has CA_EVENT_TX_START, cwnd_event, pkts_acked, and others as distinct hooks. User-space QUIC has on_packet_sent, on_ack_received, and on_packet_lost as the closest equivalents, mostly but not exactly overlapping. Any logic that depends on the fine-grained distinction between those hooks will port incorrectly.
- Follow-up commits don't self-link. Git history shows parent commits but not child commits. A reviewer finding the 2017 primary commit by title or by the Cloudflare blog's link to it will not automatically see the correction commit unless they check git log --follow on the file.
- The bug in the uncorrected code is invisible at normal operation. It only fires at minimum cwnd, a corner of the state space, so the reviewed PR passes normal tests, the port ships, and the bug lies dormant for years.
- Bug visibility differs by substrate. At kernel level, CUBIC's epoch-in-the-future bug manifests under conditions similar to those of quiche's death spiral, but the Linux community caught it in a week. Cloudflare took six years to catch the quiche version, partly because QUIC adoption at minimum-cwnd-reaching regimes took time, and partly because qlog-based diagnosis is newer.
Defensive disciplines¶
- When porting, read the file history, not just the named commit. Check for follow-up commits that cite the primary in their commit message.
- Map the kernel's callback semantics to userspace anchors explicitly. If the kernel primitive runs on ACK processing and the userspace port runs on send, document that the semantic changes and argue why that's OK, or fix the port to approximate the ACK anchor (e.g. via last_ack_time).
- Test the ported primitive at the same corner cases the kernel tests it in. CUBIC's post-loss minimum-cwnd regime is exactly the scenario the kernel's idle-period fix addresses; the port's regression test should hit it.
- Include the kernel-primitive authors as reviewers when possible. The people who wrote the fix and the fix-of-the-fix are most likely to notice if a port misses the subtlety.
Generalisation beyond CCAs¶
The pattern applies to any kernel primitive being ported to user space:
- io_uring usage patterns ported from kernel async-I/O idioms.
- epoll / kqueue event-loop abstractions in user-space frameworks.
- Raw-socket / AF_PACKET protocol implementations moved into user-space packet processors.
- Scheduler heuristics (e.g. completely-fair-scheduler logic) adapted into user-space cooperative-scheduling runtimes.
In every case: the kernel's events and primitives don't map one-to-one to user-space callsites; the first port is rarely the last; and the kernel community's follow-up fixes are the teacher you want to learn from.
Seen in¶
- sources/2026-05-12-cloudflare-when-idle-isnt-idle-how-a-linux-kernel-optimization-became-a-quic-bug
— canonical wiki instance. The 2020 port of Linux TCP
CUBIC's 2017 "after idle" fix to quiche's
on_packet_sent() inherited the first kernel commit but missed the 1-week-later follow-up guard. Six years later, Cloudflare's CI test surfaced the inherited bug as a minimum-cwnd death spiral.
Related¶
- systems/quiche — the user-space embedder where the port lived.
- systems/cubic-congestion-control — the primitive being ported.
- concepts/user-space-congestion-control — the broader category this pattern is a specific failure mode within.
- concepts/cubic-epoch — the state variable whose invariant gets violated when the port misses the follow-up fix.
- concepts/minimum-cwnd-death-spiral — the specific failure shape.
- concepts/false-idle-detection — the mechanism by which the port's logic fires spuriously.
- patterns/measure-idle-from-last-ack-not-last-send — the specific corrective pattern.
- companies/cloudflare