CLOUDFLARE 2026-06-22

How we found a bug in the hyper HTTP library

Summary

Cloudflare's Images team discovered a race condition in hyper (Rust's most widely used HTTP library) that silently truncated responses on their Images binding. After a December 2025 rearchitecture that replaced the FL intermediary with local Unix sockets, larger image responses were intermittently cut short: the HTTP response returned 200 OK with a correct Content-Length, but only a fraction of the body arrived. Six weeks of debugging across application tracing, distributed tracing, and finally kernel-level strace traced the root cause to a discarded Poll::Pending in hyper's HTTP/1 dispatch loop. The fix was four lines of code; the investigation touched every layer of the stack.

Key Takeaways

  1. The bug was invisible to application-level observability. Tracing, logging, and HTTP status codes all reported success. The Images service genuinely believed it had written the full response. Only strace — recording raw syscalls — revealed the premature shutdown(fd, SHUT_WR) immediately after a partial sendto.

  2. A performance improvement surfaced a latent bug. The December 2025 migration from FL (network sockets) to a local Unix-socket intermediary made the system faster overall, but the new reader consumed data slightly more slowly than FL in certain windows. This difference of a few milliseconds in backpressure was enough to fill the socket buffer and trigger a race condition that had existed in hyper for years.

  3. let _ = expr in Rust discards Poll::Pending silently. In hyper's dispatch.rs, the poll_flush result was discarded with let _. When the socket buffer was full and the flush returned Poll::Pending, the loop went on to check wants_read_again(), which returned false, so the loop returned Poll::Ready(Ok(())) and triggered shutdown with data still buffered. This is a Rust async footgun: discarding a Poll value without checking it can silently skip incomplete I/O.

  4. Failing requests consistently delivered approximately 219 KB. This matched the kernel socket buffer size in production, confirming the hypothesis: only the initial chunk that fit in the socket buffer was sent before the connection was closed.

  5. The bug reproduced only under real concurrency on the production path. Local curl, integration tests on macOS/Debian VMs, and replayed pcap traces never triggered it. The production Workers runtime reader paused just long enough (order of milliseconds) to fill the socket outbound buffer at the critical moment.

  6. strace observability changes the timing. Broadening the syscall filter slowed the process enough to shift the timing window and make the bug disappear — a Heisenbug. Only a narrow filter kept overhead minimal enough to still trigger the failure.

  7. Two fixes were explored. The initial fix returned Poll::Pending from the dispatch loop when flush was incomplete — correct for Cloudflare's use case but problematic for keepalive connections. The upstream-accepted fix places the flush check in poll_shutdown(), ensuring all buffered data drains before the socket is closed, without altering the dispatch loop's polling semantics.

  8. A deterministic test was essential for the upstream contribution. A custom TCP stream wrapper that accepted 8 KB then returned Poll::Pending on all subsequent writes let the team write a test that reliably triggered the race without timing sensitivity.
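
The throttled stream from takeaway 8 can be sketched with the standard library alone. This is a minimal illustration, not the test from the upstream contribution: ThrottledWriter, noop_waker, and write_twice are hypothetical names, and the poll_write method only mirrors the shape of an async write method rather than implementing any real async I/O trait.

```rust
use std::io;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a stream wrapper: accepts up to `limit` bytes
/// in total, then reports Pending on every later write.
struct ThrottledWriter {
    accepted: usize,
    limit: usize,
}

impl ThrottledWriter {
    fn new(limit: usize) -> Self {
        Self { accepted: 0, limit }
    }

    /// Ready(n) while there is room, Pending once the simulated socket
    /// buffer is full. A real implementation would also register the waker.
    fn poll_write(&mut self, _cx: &mut Context<'_>, buf: &[u8]) -> Poll<io::Result<usize>> {
        if buf.is_empty() {
            return Poll::Ready(Ok(0));
        }
        let room = self.limit - self.accepted;
        if room == 0 {
            return Poll::Pending;
        }
        let n = room.min(buf.len());
        self.accepted += n;
        Poll::Ready(Ok(n))
    }
}

/// Waker that does nothing; enough to build a Context for manual polling.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn noop_waker() -> Waker {
    Waker::from(Arc::new(NoopWake))
}

/// Writes the payload twice and reports (bytes accepted by the first write,
/// whether the second write returned Pending).
fn write_twice(limit: usize, payload_len: usize) -> (usize, bool) {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut writer = ThrottledWriter::new(limit);
    let payload = vec![0u8; payload_len];

    let first = match writer.poll_write(&mut cx, &payload) {
        Poll::Ready(Ok(n)) => n,
        _ => 0,
    };
    let second_pending = writer.poll_write(&mut cx, &payload[first..]).is_pending();
    (first, second_pending)
}

fn main() {
    let (first, pending) = write_twice(8 * 1024, 64 * 1024);
    println!("first write accepted {} bytes; second write pending: {}", first, pending);
}
```

Because the wrapper never becomes writable again, the stalled state holds no matter how fast the test machine is, which is what removes the timing sensitivity.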

Architectural Details

Request path (post-rearchitecture)

Client → Workers Runtime → Intermediary (local worker binding)
       → [Unix socket] → Images service (hyper) → [encodes image]
       → [hyper writes response to socket] → Intermediary → Workers Runtime → Client

The race condition sequence (failing case)

  1. Images service finishes encoding; hands full response (e.g. 14.9 MB) to hyper as one in-memory block.
  2. Hyper buffers the response; marks write state as Writing::Closed (encoding complete).
  3. poll_flush is called — socket accepts ~219 KB, buffer is full, returns Poll::Pending.
  4. let _ discards the Pending signal.
  5. wants_read_again() returns false (full request already consumed).
  6. Loop returns Poll::Ready(Ok(())) — signals "connection work is done."
  7. poll_shutdown() fires → shutdown(fd, SHUT_WR) syscall issued.
  8. Client receives 219 KB + EOF, despite expecting 14.9 MB.
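
The sequence above can be replayed in a toy model. This is a sketch under stated assumptions, not hyper's code: MockConn, buggy_dispatch, and run_failing_case are hypothetical names, the socket buffer is modelled as a fixed capacity that nobody drains, and 14.9 MB and 219 KB are taken as decimal byte counts.

```rust
use std::task::Poll;

/// Toy connection: bytes still held in the user-space write buffer, plus a
/// simulated kernel socket buffer with limited free space.
struct MockConn {
    buffered: usize,    // bytes not yet handed to the socket
    socket_room: usize, // free space left in the simulated socket buffer
    delivered: usize,   // bytes the peer will eventually see
    shut_down: bool,    // whether the write half has been closed
}

impl MockConn {
    /// Moves as much as fits into the socket; Pending if anything is left.
    fn poll_flush(&mut self) -> Poll<()> {
        let n = self.buffered.min(self.socket_room);
        self.buffered -= n;
        self.socket_room -= n;
        self.delivered += n;
        if self.buffered == 0 {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }

    fn shutdown(&mut self) {
        self.shut_down = true;
    }
}

/// Models the buggy control flow: the flush result is thrown away, and the
/// loop reports "done" as soon as there is nothing more to read.
fn buggy_dispatch(conn: &mut MockConn, wants_read_again: bool) -> Poll<()> {
    let _ = conn.poll_flush(); // Pending is silently discarded here
    if wants_read_again {
        // Simplification: in this toy model, more reading means "not done".
        return Poll::Pending;
    }
    Poll::Ready(())
}

/// Returns (bytes delivered, bytes stranded in the buffer, shut down?).
fn run_failing_case(response_len: usize, socket_capacity: usize) -> (usize, usize, bool) {
    let mut conn = MockConn {
        buffered: response_len,
        socket_room: socket_capacity,
        delivered: 0,
        shut_down: false,
    };
    // The request was fully consumed, so wants_read_again is false (step 5).
    if buggy_dispatch(&mut conn, false).is_ready() {
        conn.shutdown(); // step 7: shutdown with data still buffered
    }
    (conn.delivered, conn.buffered, conn.shut_down)
}

fn main() {
    let (delivered, stranded, shut_down) = run_failing_case(14_900_000, 219_000);
    println!(
        "delivered {} bytes, stranded {} bytes, shut down: {}",
        delivered, stranded, shut_down
    );
}
```

In the model, only the bytes that fit in the socket buffer on the first flush ever reach the peer; everything else is still buffered when shutdown is issued.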

The fix (upstream PR #4018)

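pub(crate) fn poll_shutdown(
    &mut self,
    cx: &mut Context<'_>,
) -> Poll<io::Result<()>> {
    // Drain the write buffer first; ready! returns Poll::Pending early
    // if the flush has not completed, so shutdown is not reached yet.
    ready!(self.poll_flush(cx)?);
    // Only once nothing is buffered is the write half shut down.
    Pin::new(&mut self.io).poll_shutdown(cx)
}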

Ensures all buffered data is flushed before issuing shutdown. Leaves the dispatch loop unchanged, avoiding keepalive regressions.
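
The effect of flushing inside poll_shutdown can be illustrated with a small model. This is a sketch under assumptions rather than hyper's implementation: MockConn, peer_reads, and drain_then_shutdown are hypothetical names, and the peer is simulated by freeing socket buffer space between polls in place of a real waker.

```rust
use std::task::{ready, Poll};

/// Toy connection with a user-space write buffer and a bounded socket buffer.
struct MockConn {
    buffered: usize,    // bytes not yet handed to the socket
    socket_room: usize, // free space left in the simulated socket buffer
    delivered: usize,   // bytes handed to the socket so far
    shut_down: bool,    // whether the write half has been closed
}

impl MockConn {
    /// Moves as much as fits into the socket; Pending if anything is left.
    fn poll_flush(&mut self) -> Poll<()> {
        let n = self.buffered.min(self.socket_room);
        self.buffered -= n;
        self.socket_room -= n;
        self.delivered += n;
        if self.buffered == 0 {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }

    /// Simulates the peer reading `n` bytes, which frees socket buffer space.
    fn peer_reads(&mut self, n: usize) {
        self.socket_room += n;
    }

    /// Same shape as the fix: flush first, shut down only when drained.
    fn poll_shutdown(&mut self) -> Poll<()> {
        ready!(self.poll_flush()); // returns Pending while data remains
        self.shut_down = true;
        Poll::Ready(())
    }
}

/// Polls shutdown until it completes, letting the peer drain between polls.
/// Returns (number of polls, bytes delivered, shut down?).
fn drain_then_shutdown(response_len: usize, socket_capacity: usize) -> (usize, usize, bool) {
    let mut conn = MockConn {
        buffered: response_len,
        socket_room: socket_capacity,
        delivered: 0,
        shut_down: false,
    };
    let mut polls = 0;
    loop {
        polls += 1;
        if conn.poll_shutdown().is_ready() {
            break;
        }
        // Shutdown must not have happened while data is still buffered.
        assert!(!conn.shut_down);
        conn.peer_reads(socket_capacity);
    }
    (polls, conn.delivered, conn.shut_down)
}

fn main() {
    let (polls, delivered, shut_down) = drain_then_shutdown(14_900_000, 219_000);
    println!("polls: {}, delivered: {}, shut down: {}", polls, delivered, shut_down);
}
```

The model makes the design choice visible: the caller simply keeps polling shutdown, and the close is deferred until the buffer is empty, so nothing about how the dispatch loop itself is polled has to change.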

Operational Numbers

  • Response truncation: ~219 KB delivered out of multi-MB responses (socket buffer size limit)
  • Reproduction rate: 19/25 requests failed in one batch run
  • Affected hyper versions: 0.14.x through 1.8.x (multiple major versions, spanning years)
  • Time to fix: 6 weeks of investigation, 4 lines of code

Caveats

  • The article doesn't disclose the exact socket buffer size configuration in production.
  • The bug affects only HTTP/1.1 connections; HTTP/2 uses a different write path in hyper.
  • Cloudflare's production still runs an internal fork while awaiting the upstream release.
