Go Netlink (vishvananda/netlink)¶
github.com/vishvananda/netlink is the widely used Go library
for interacting with the Linux Netlink kernel interface.
Cloudflare's Magic Transit and Magic WAN products use it to
configure the kernel networking stack.
Wiki-scope role: red herring in the 2025-10 arm64 bug¶
In Cloudflare's months-long hunt for a recurring fatal Go panic
on arm64
(sources/2025-10-08-cloudflare-we-found-a-bug-in-gos-arm64-compiler),
every observed crash stack had (*NetlinkSocket).Receive on
it. Both Cloudflare's internal traces and an existing
upstream issue
(golang#73259)
implicated the same function. Two theories were plausible:
1. Go Netlink bug: the library uses unsafe.Pointer and could be invoking undefined behavior that happened to manifest only on arm64.
2. Go runtime bug: triggered by something specific to (*NetlinkSocket).Receive, for a reason not yet known.
The real answer was (2), but for a mundane reason:
(*NetlinkSocket).Receive happened to have a stack frame large
enough that the Go arm64 compiler emitted its epilogue
as a split ADD $n, RSP, RSP; ADD $(m<<12), RSP, RSP pair.
A goroutine interrupted between those two instructions has a
stack pointer that is only partially restored. Any function
whose frame size does not fit a single arm64 ADD immediate
(frames beyond the 1 << 12 = 4096-byte range) would have been
equally exposed; the library was the accidental trigger, not
the culprit.
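The arithmetic behind the split pair can be sketched in a few lines of Go. This is only an illustration, not the Go assembler's code: splitImmediate is a hypothetical helper, and it assumes the high part itself fits in 12 bits (frames below 1 << 24 bytes). The input is the sum of the two immediates this page quotes from the coredump, 80 and 16<<12.

```go
package main

import "fmt"

// splitImmediate splits a frame size into the two parts of a
// two-instruction stack adjustment: a low part that fits in 12 bits and a
// high part counted in units of 1<<12.
// Hypothetical helper for illustration; not taken from the Go toolchain.
func splitImmediate(frameSize uint32) (lo, hi uint32) {
	lo = frameSize & 0xfff // low 12 bits: the n in ADD $n, RSP, RSP
	hi = frameSize >> 12   // remaining bits: the m in ADD $(m<<12), RSP, RSP
	return lo, hi
}

func main() {
	// 80 + (16 << 12) = 65616 bytes, the two immediates seen in the coredump.
	lo, hi := splitImmediate(80 + 16<<12)
	fmt.Printf("ADD $%d, RSP, RSP; ADD $(%d<<12), RSP, RSP\n", lo, hi)
	// prints: ADD $80, RSP, RSP; ADD $(16<<12), RSP, RSP
}
```

A frame size that has bits set in both the low and the high part needs both instructions, which is what opens a window between them.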
Investigative lesson¶
The unsafe.Pointer usage in Go Netlink was a red flag — a
legitimate source of memory corruption on a typical debugging
day — and made the "corrupt stack via user code" hypothesis
plausible. Code audits revealed nothing. The bug's true
location was a toolchain-level instruction-sequencing
issue that no amount of reviewing the library's source could
have surfaced. "It was fairly clear from the beginning that
the panic was remote from the actual bug."
The ultimate disambiguation came from a production coredump
loaded into dlv: the goroutine's program counter was
paused between two specific opcodes (ADD $80, RSP, RSP and
ADD $(16<<12), RSP, RSP) inside the function's epilogue —
visible only at the disassembly level, not at the Go source
level.
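To make the coredump observation concrete, here is a toy model of the stack pointer at each instruction boundary of that epilogue. It is a sketch under stated assumptions: the starting address is made up, epilogueStep and spAtEachBoundary are hypothetical names, and the model ignores everything else a real epilogue does. It only shows that the value between the two ADDs matches neither the callee's frame nor the caller's.

```go
package main

import "fmt"

// epilogueStep is one stack-pointer adjustment in a hypothetical epilogue.
type epilogueStep struct {
	asm   string // disassembly text, for display only
	delta uint64 // bytes added to the stack pointer
}

// spAtEachBoundary returns the stack pointer at every instruction boundary:
// before the first step, between steps, and after the last one.
func spAtEachBoundary(sp uint64, steps []epilogueStep) []uint64 {
	out := []uint64{sp}
	for _, s := range steps {
		sp += s.delta
		out = append(out, sp)
	}
	return out
}

func main() {
	// The two opcodes reported from the coredump.
	steps := []epilogueStep{
		{"ADD $80, RSP, RSP", 80},
		{"ADD $(16<<12), RSP, RSP", 16 << 12},
	}
	const inFrameSP = 0x4000000000 // made-up address, illustration only
	sps := spAtEachBoundary(inFrameSP, steps)
	callerSP := sps[len(sps)-1]

	for i, sp := range sps {
		state := "inconsistent: neither the callee's nor the caller's SP"
		switch sp {
		case inFrameSP:
			state = "callee frame still intact"
		case callerSP:
			state = "frame fully released"
		}
		fmt.Printf("boundary %d: SP=%#x (%s)\n", i, sp, state)
	}
}
```

Only the middle boundary is reported as inconsistent, and that is the point at which the program counter in the coredump was paused.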
Seen in¶
- sources/2025-10-08-cloudflare-we-found-a-bug-in-gos-arm64-compiler — Go Netlink was the on-stack library in every observed crash; the root cause was in Go's arm64 compiler, not in this library.
Related¶
- systems/linux-netlink — the kernel interface this library wraps.
- systems/go-compiler — actual root cause of the 2025-10 bug.
- systems/go-runtime-scheduler — the async-preemption mechanism that made the race reachable.
- concepts/async-preemption-go.