TLS close_notify¶
Definition¶
close_notify is the TLS protocol's graceful-shutdown
signal — an Alert record with description close_notify sent
by one peer to inform the other that it will not send any more
application data on this session. Defined for TLS 1.2 in RFC
5246 §7.2.1 and carried forward in TLS 1.3 (RFC 8446 §6.1).
Semantically, close_notify marks the end of the sender's
half of the session — the receiver can still send its own
application data and its own close_notify in return. After both
peers have exchanged close_notify, the session is orderly-
closed and the underlying transport (usually TCP) can be torn
down.
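The half-close semantics above can be sketched as a tiny state machine. This is an illustrative model only (the types and method names are hypothetical, not the rustls API): each direction closes independently when its close_notify is sent, and only after both alerts is the session orderly-closed.

```rust
// Hypothetical model of close_notify half-close semantics.
// Not the rustls API; a sketch of the protocol state only.

#[derive(Debug, PartialEq, Clone, Copy)]
enum Half {
    Open,
    Closed, // close_notify sent/received for this direction
}

struct TlsSession {
    ours: Half,   // our sending half
    theirs: Half, // peer's sending half
}

impl TlsSession {
    fn new() -> Self {
        TlsSession { ours: Half::Open, theirs: Half::Open }
    }

    /// We send close_notify: we promise to send no more application
    /// data, but we may still *receive* the peer's data.
    fn send_close_notify(&mut self) {
        self.ours = Half::Closed;
    }

    /// The peer's close_notify arrives: their half is done; we may
    /// still send until we issue our own close_notify.
    fn recv_close_notify(&mut self) {
        self.theirs = Half::Closed;
    }

    /// Only after both alerts is the session orderly-closed and the
    /// underlying TCP connection safe to tear down.
    fn orderly_closed(&self) -> bool {
        self.ours == Half::Closed && self.theirs == Half::Closed
    }
}

fn main() {
    let mut s = TlsSession::new();
    s.send_close_notify();
    // Half-closed: the peer can still send application data to us.
    assert!(!s.orderly_closed());
    s.recv_close_notify();
    assert!(s.orderly_closed());
    println!("orderly_closed = {}", s.orderly_closed());
}
```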
The buffered-trailer edge case¶
The interesting operational edge case — and the one
Fly.io's
2025-02 incident hit — is when the underlying transport has
buffered bytes behind the close_notify. Per the Fly.io
post:
"When a TLS session is orderly-closed, with a CloseNotifyAlert record, the sender of that record has informed its counterparty that no further data will be sent. But if there's still buffered data on the underlying connection, TlsStream mishandles its Waker, and we fall into a busy-loop."
The TLS library has to handle: "no more TLS-encrypted application data is coming, BUT I still have raw socket bytes queued — deliver those trailing bytes, then flag end-of-stream." If that bookkeeping mishandles its Wakers, the readiness signal propagated up to Tokio becomes a spurious wake and the caller spins.
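A minimal sketch of the correct bookkeeping, using hypothetical types (this is not the tokio-rustls internals): after close_notify, any plaintext already decrypted and buffered must be handed out first, and only a fully drained buffer reports EOF. The bug class is returning "ready" again at step 2 instead of a clean zero-byte EOF, which sends the caller into a busy-loop.

```rust
// Hypothetical drain-then-EOF reader; illustrates the invariant,
// not the actual tokio-rustls TlsStream implementation.

struct TrailingReader {
    buffered: Vec<u8>,       // plaintext decrypted before the alert
    pos: usize,              // how much of it we've handed out
    close_notify_seen: bool, // peer promised no more app data
}

impl TrailingReader {
    fn read(&mut self, out: &mut [u8]) -> usize {
        // 1. Serve buffered trailing bytes before anything else.
        if self.pos < self.buffered.len() {
            let n = out.len().min(self.buffered.len() - self.pos);
            out[..n].copy_from_slice(&self.buffered[self.pos..self.pos + n]);
            self.pos += n;
            return n;
        }
        // 2. Buffer drained and close_notify seen: clean EOF.
        //    Signalling readiness here instead of returning 0 is
        //    exactly what makes the caller spin.
        debug_assert!(self.close_notify_seen);
        0
    }
}

fn main() {
    let mut r = TrailingReader {
        buffered: b"trailing body bytes".to_vec(),
        pos: 0,
        close_notify_seen: true,
    };
    let mut buf = [0u8; 8];
    let mut total = 0;
    loop {
        let n = r.read(&mut buf);
        if n == 0 { break; } // EOF only after the trailer is delivered
        total += n;
    }
    assert_eq!(total, 19);
    println!("delivered {} trailing bytes, then EOF", total);
}
```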
This is a protocol-state × async-state interaction bug — neither layer in isolation is wrong; their composition is.
Operational reality¶
Most TLS sessions don't orderly-close. Clients drop TCP
connections, time out, or close the socket without a trailing
close_notify — and modern TLS (and modern TLS libraries) treat
this as acceptable rather than a truncation attack, since
HTTP/1.1 and HTTP/2 have their own end-of-message framing.
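That tolerance can be sketched as a small decision rule (the enum and helper here are hypothetical, not a rustls or hyper API): a bare TCP close without close_notify is acceptable exactly when the application protocol's own framing says the message completed, so the truncation cannot have cut data.

```rust
// Hypothetical classification of how a TLS connection ended.
// Sketch of the policy described above, not a library API.

#[derive(Debug, PartialEq)]
enum Shutdown {
    Orderly,      // close_notify received before transport EOF
    TruncatedTcp, // bare TCP FIN/RST, no close_notify
}

/// Returns true when the connection ending is acceptable, assuming
/// the HTTP layer reports whether its own framing (Content-Length,
/// chunked encoding, or HTTP/2 frames) saw the message complete.
fn acceptable_eof(shutdown: Shutdown, http_message_complete: bool) -> bool {
    match shutdown {
        Shutdown::Orderly => true,
        // Modern stacks tolerate the missing close_notify rather than
        // treating it as a truncation attack, as long as the
        // application-layer framing completed.
        Shutdown::TruncatedTcp => http_message_complete,
    }
}

fn main() {
    assert!(acceptable_eof(Shutdown::Orderly, true));
    assert!(acceptable_eof(Shutdown::TruncatedTcp, true));
    assert!(!acceptable_eof(Shutdown::TruncatedTcp, false));
    println!("ok");
}
```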
This is why Fly.io's bug only manifested under a specific partner load-testing pattern:
"All of them sent small HTTP bodies and then terminated early. We figured some of those connections errored out, and set up the 'TLS CloseNotify happened before EOF' scenario."
Tens of thousands of short connections were enough to tickle the edge case on a couple of hosts.
Seen in¶
- sources/2025-02-26-flyio-taming-a-voracious-rust-proxy —
the buffered-trailer case on orderly close_notify triggered a
Waker mis-fire in the tokio-rustls TlsStream, which put
fly-proxy into a 100% CPU busy-poll on two IAD edge hosts.
Fix: rustls PR #1950, issue tokio-rustls#72.
Related¶
- systems/rustls / systems/tokio-rustls — the Rust TLS implementation + adapter where the 2025-02 bug lived.
- concepts/asyncread-contract — the layer above where the mis-handling surfaces.
- concepts/spurious-wakeup-busy-loop — the failure mode.
- companies/flyio