CONCEPT Cited by 1 source

Compiler-generated race condition

Definition

A compiler-generated race condition is a race whose root cause is not in the user's source code but in the sequence of machine opcodes that the compiler or assembler emitted for a correct-looking source construct. The user's source is free of races; the assembly is not.

This is distinct from:

  • Source-level data races — two threads access the same variable without synchronization, and at least one of them writes. Detectable by ThreadSanitizer or Go's -race, and ruled out at compile time by Rust's borrow checker in safe code.
  • ABI violations — compiler emits code that doesn't respect the calling convention; usually easier to find because the failure tends to be reproducible and local.
  • Memory-ordering bugs — the compiler or CPU reorders loads/stores in a way the code did not anticipate, typically because a required barrier is missing. Addressed by memory models.

The compiler-generated race is harder to diagnose because reading the source code reveals nothing wrong. The behaviour only appears under a specific combination of the shape of the emitted code, the runtime's concurrent observers, and specific inputs (stack frame size, immediate values, etc.).

Typical shape

  1. The compiler translates a semantically-atomic source operation into multiple machine opcodes.
  2. The intermediate state between opcodes is observable by a concurrent actor — usually the language runtime (GC, scheduler, signal handler).
  3. The observer's invariant is violated by reading the intermediate state → crash.

See concepts/split-instruction-race-window for the specific case where the compiler's decomposition is forced by ISA encoding limits.
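The three steps above can be sketched as a small deterministic model. This is an illustration only: the step and observation types, the observe function, and the numeric values are hypothetical, and a real observer is a runtime component interrupting real opcodes, not a loop. The model treats one logical update as a list of partial updates and records what an observer would see at each opcode boundary.

```go
package main

import "fmt"

// step models one machine opcode as a partial update to a shared value.
type step struct {
	name  string
	delta uint64
}

// observation records what a concurrent observer would see if it
// interrupted execution right after a given opcode.
type observation struct {
	after      string
	value      uint64
	consistent bool // true if value is the fully-old or fully-new state
}

// observe replays the steps and checks the observer's invariant
// (value is either old or final) at every opcode boundary.
func observe(old uint64, steps []step) []observation {
	final := old
	for _, s := range steps {
		final += s.delta
	}
	obs := make([]observation, 0, len(steps))
	cur := old
	for _, s := range steps {
		cur += s.delta
		obs = append(obs, observation{s.name, cur, cur == old || cur == final})
	}
	return obs
}

func main() {
	// One source-level update decomposed into two opcodes (assumed values).
	steps := []step{{"opcode 1", 0x50}, {"opcode 2", 0x1000}}
	for _, o := range observe(0x4000_0000, steps) {
		fmt.Printf("after %s: value=%#x consistent=%t\n", o.after, o.value, o.consistent)
	}
}
```

In this model the boundary after the first opcode is the only inconsistent one, which is the window an observer has to hit for the invariant to break.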

Canonical wiki instance

sources/2025-10-08-cloudflare-we-found-a-bug-in-gos-arm64-compiler:

  • Source — a Go function with a stack frame slightly larger than 4 KiB. Perfectly correct source code.
  • Compiler — Go toolchain's arm64 backend emits a function epilogue that adjusts SP via two ADD opcodes (forced by arm64's 12-bit ADD immediate).
  • Concurrent observer — Go's garbage collector running stack-scan during an async preemption that lands between the two SP-adjustment opcodes.
  • Invariant violated — SP is partially adjusted; GC's stack unwinder reads garbage as a return address.
  • Result — fatal error: traceback did not unwind completely or SIGSEGV at m.incgo+0x118.
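The encoding limit behind the two ADD opcodes can be worked through with a little arithmetic. The sketch below assumes the standard arm64 ADD (immediate) form: a 12-bit value, optionally shifted left by 12 bits. The helper name addImmediates is hypothetical and this is not the Go assembler's code; the real SP offset also depends on how the compiler lays out the frame.

```go
package main

import "fmt"

// addImmediates splits an SP adjustment into the immediates that a single
// arm64 ADD (immediate) opcode can carry: bits 0..11 unshifted, or
// bits 12..23 via the optional left shift by 12. It returns one immediate
// when the offset fits in one opcode and two when it does not.
// Hypothetical helper for illustration only.
func addImmediates(offset uint32) ([]uint32, error) {
	if offset >= 1<<24 {
		return nil, fmt.Errorf("offset %#x does not fit in two ADD immediates", offset)
	}
	lo := offset & 0x000fff // encodable unshifted
	hi := offset & 0xfff000 // encodable with the 12-bit shift
	var imms []uint32
	if lo != 0 {
		imms = append(imms, lo)
	}
	if hi != 0 {
		imms = append(imms, hi)
	}
	return imms, nil
}

func main() {
	// Assumed example offsets around the 4 KiB boundary.
	for _, off := range []uint32{4080, 4096, 4176} {
		imms, err := addImmediates(off)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("offset %d needs %d ADD opcode(s)\n", off, len(imms))
	}
}
```

An offset of 4176 (0x1050) has bits set in both halves, so it splits into 0x50 and 0x1000. Offsets of 4080 and 4096 each fit in a single opcode, which is why only some frame sizes expose the window.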

Why these are hard to find

  • Source code audits come up empty — the race is below the source level. Cloudflare's team described exhausting a code audit before getting to the coredump.
  • Runtime log patterns implicate the wrong component — every crash had (*NetlinkSocket).Receive on the stack; reasonable theories included unsafe.Pointer misuse in the netlink library.
  • Small variations mask the bug — "some of the behavior is still puzzling. It's a one-instruction race condition, so it's unsurprising that small changes could have large impact." Same reproducer crashed on go1.23.4 but not on go1.23.9, even though the split ADD was still emitted.

The smoking gun is usually at the disassembly level

The decisive evidence in Cloudflare's 2025-10 bug came from a coredump loaded in dlv: the faulting PC was sitting between two specific opcodes inside a function epilogue. No source-level tool could have named that location; it required disass -a <start> <end> to see the split ADD pair. See patterns/isolated-reproducer-for-race-condition for the follow-up step — reducing the observation to a minimal stdlib-only reproducer.

The fix lives in the toolchain

By definition, compiler-generated races cannot be fixed in user code, because the user code is already correct. The fix must be upstream in the compiler / assembler. See patterns/upstream-the-fix. User-code workarounds (e.g. shrinking the stack frame to avoid the split ADD) are brittle: the bug stays latent and can resurface in any future code that happens to produce the same codegen shape.
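One shape a toolchain-side fix can take is to build the full offset in a scratch register and then write SP with a single opcode, so that no boundary exposes a half-adjusted SP. The toy model below compares that with the split sequence. It is a sketch under assumptions: the regs and insn types, the torn function, and the constants are hypothetical, and it does not claim to reproduce the actual upstream patch.

```go
package main

import "fmt"

// regs is a toy register file: the stack pointer and one scratch register.
type regs struct{ sp, tmp uint64 }

// insn models one opcode as a function that mutates the register file.
type insn func(r *regs)

// torn counts the opcode boundaries at which SP holds a value that is
// neither the pre-epilogue nor the post-epilogue stack pointer.
func torn(start, frame uint64, seq []insn) int {
	r := regs{sp: start}
	n := 0
	for _, op := range seq {
		op(&r)
		if r.sp != start && r.sp != start+frame {
			n++
		}
	}
	return n
}

func main() {
	const start, frame = 0x4000_0000, 0x1050 // assumed values

	// Two opcodes that each write SP: one boundary sees a partial SP.
	splitAdd := []insn{
		func(r *regs) { r.sp += 0x50 },   // first ADD: low 12 bits
		func(r *regs) { r.sp += 0x1000 }, // second ADD: shifted high bits
	}
	// Two opcodes, but only the last one writes SP.
	viaScratch := []insn{
		func(r *regs) { r.tmp = 0x1050 }, // materialize the offset in the scratch register
		func(r *regs) { r.sp += r.tmp },  // single ADD updates SP in one step
	}

	fmt.Println("split ADD torn boundaries:  ", torn(start, frame, splitAdd))
	fmt.Println("via scratch torn boundaries:", torn(start, frame, viaScratch))
}
```

Both sequences use two opcodes; what differs is how many of them write SP. That property is decided by the code generator, which is why shrinking one function's frame in user code does not remove the underlying hazard.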

Seen in
