Go Netlink (vishvananda/netlink)

github.com/vishvananda/netlink is the widely used Go library for interacting with the Linux Netlink kernel interface. Cloudflare's Magic Transit and Magic WAN products use it to configure the kernel networking stack.

Wiki-scope role: red herring in the 2025-10 arm64 bug

In Cloudflare's months-long hunt for a recurring fatal Go panic on arm64 (sources/2025-10-08-cloudflare-we-found-a-bug-in-gos-arm64-compiler), every observed crash stack had (*NetlinkSocket).Receive on it. Both Cloudflare's internal traces and an existing upstream issue (golang#73259) implicated the same function. Two theories were plausible:

  1. Go Netlink bug — the library uses unsafe.Pointer and could be invoking undefined behavior that happened to manifest only on arm64.
  2. Go runtime bug — triggered by something specific to (*NetlinkSocket).Receive for a reason not yet known.

The real answer was (2), but for a mundane reason: (*NetlinkSocket).Receive happened to have a stack frame large enough that the Go arm64 compiler emitted its epilogue as a split pair, ADD $n, RSP, RSP; ADD $(m<<12), RSP, RSP. Any function with a frame larger than 1 << 12 = 4096 bytes that needed the same split would have been equally exposed. The library was the accidental trigger, not the culprit.
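To make the split concrete, the sketch below decomposes a frame size into the two immediates of such a pair. It assumes the low 12 bits go into the first ADD and the remaining bits, shifted left by 12, into the second, matching the ADD $n / ADD $(m<<12) shape above. splitFrame and fitsImm12 are hypothetical helper names for illustration, not part of the Go toolchain. The 65,616-byte input is simply 80 + (16<<12), the pair of immediates reported from the coredump.

```go
package main

import "fmt"

// imm12Max is the largest value that fits in a 12-bit unsigned immediate.
const imm12Max = 1<<12 - 1

// fitsImm12 reports whether size fits in a single unshifted 12-bit immediate.
// Hypothetical helper: it models only the 1<<12 threshold discussed in the
// text, not every encoding the assembler might choose.
func fitsImm12(size uint32) bool {
	return size <= imm12Max
}

// splitFrame returns the two immediates of a split epilogue:
//   ADD $lo, RSP, RSP
//   ADD $(hi<<12), RSP, RSP
// Assumption: low 12 bits first, remaining bits second.
func splitFrame(size uint32) (lo, hi uint32) {
	return size & imm12Max, size >> 12
}

func main() {
	const frame = 65616 // 80 + (16 << 12)

	fmt.Println(fitsImm12(frame)) // prints "false"

	lo, hi := splitFrame(frame)
	fmt.Printf("lo=%d hi=%d\n", lo, hi) // prints "lo=80 hi=16"

	// The two immediates always add back up to the original frame size.
	fmt.Println(lo+hi<<12 == frame) // prints "true"
}
```

A frame of 4,095 bytes or less yields hi == 0 here, which is the case where a single ADD suffices and there is no gap between instructions at all.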

Investigative lesson

The unsafe.Pointer usage in Go Netlink was a red flag, since on a typical debugging day it is a legitimate source of memory corruption, and it made the "corrupt stack via user code" hypothesis plausible. Code audits revealed nothing. The bug's true location was a toolchain-level instruction-sequencing issue that no amount of reviewing the library's source could have surfaced. As the source post puts it: "It was fairly clear from the beginning that the panic was remote from the actual bug."
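For readers unfamiliar with why unsafe.Pointer draws suspicion, here is a minimal sketch of the general pattern: reinterpreting raw bytes as a struct without copying. It is not code from the library. msgHeader, parseHeader, and headerBytes are hypothetical names, and the struct is assumed to mirror the 16-byte layout of the kernel's struct nlmsghdr.

```go
package main

import (
	"fmt"
	"unsafe"
)

// msgHeader is a local stand-in with the same field layout as the kernel's
// struct nlmsghdr (assumption for illustration; not a library type).
type msgHeader struct {
	Len   uint32
	Type  uint16
	Flags uint16
	Seq   uint32
	Pid   uint32
}

// parseHeader views the start of b as a msgHeader without copying.
// The length check is the only thing keeping the cast in bounds; the
// compiler cannot verify it, and it also cannot verify that &b[0] is
// suitably aligned. Mistakes here can corrupt memory silently.
func parseHeader(b []byte) (*msgHeader, bool) {
	if len(b) < int(unsafe.Sizeof(msgHeader{})) {
		return nil, false
	}
	return (*msgHeader)(unsafe.Pointer(&b[0])), true
}

// headerBytes copies the in-memory bytes of h into a fresh slice, so the
// example round-trips without depending on host endianness.
func headerBytes(h msgHeader) []byte {
	buf := make([]byte, unsafe.Sizeof(h))
	copy(buf, unsafe.Slice((*byte)(unsafe.Pointer(&h)), unsafe.Sizeof(h)))
	return buf
}

func main() {
	src := msgHeader{Len: 16, Type: 2, Seq: 7, Pid: 42}
	buf := headerBytes(src)

	h, ok := parseHeader(buf)
	fmt.Println(ok, *h == src) // prints "true true"

	// A buffer shorter than the header is rejected instead of over-read.
	_, ok = parseHeader(buf[:8])
	fmt.Println(ok) // prints "false"
}
```

Dropping the length check would let the cast read past the end of the slice, which is the class of bug an audit of such code looks for, and the class that the audits in this case did not find.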

The ultimate disambiguation came from a production coredump loaded into dlv: the goroutine had stopped with its program counter between two specific instructions (ADD $80, RSP, RSP and ADD $(16<<12), RSP, RSP) inside the function's epilogue, a detail visible only at the disassembly level, not at the Go source level.
