CONCEPT Cited by 1 source

Async preemption (Go)

Definition

Async preemption is the Go runtime's mechanism — introduced in Go 1.14 — for forcibly pausing a goroutine that has been running too long, without needing the goroutine to cooperate.

The sysmon monitoring thread tracks how long each goroutine has been running. When a goroutine has run for more than 10 ms (at time of writing), sysmon sends SIGURG to the OS thread (the m) running it. The signal handler rewrites the interrupted context's program counter and stack to synthesise a call to runtime.asyncPreempt, which saves the register state and yields back to the scheduler. See preempt.go and preempt_arm64.s.
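The effect can be observed from user code. The following is a minimal sketch, not taken from the source: it assumes Go 1.19 or later (for atomic.Bool) and a platform where async preemption is enabled, and the helper names spin and resumeAfterSpin are hypothetical. It pins the program to a single P, starts a goroutine whose loop body contains no function calls (the atomic load is expected to be inlined), and measures how long the caller takes to run again.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// spin busy-waits until stop is set. The loop body is expected to compile to
// an inline atomic load, so it offers the scheduler no cooperative yield point.
func spin(stop *atomic.Bool) {
	for !stop.Load() {
	}
}

// resumeAfterSpin limits the program to one P, starts a spinning goroutine,
// sleeps, and reports how long it took before the caller ran again.
func resumeAfterSpin(sleep time.Duration) time.Duration {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop atomic.Bool
	done := make(chan struct{})
	go func() {
		defer close(done)
		spin(&stop)
	}()

	start := time.Now()
	time.Sleep(sleep) // the caller parks; the spinner takes the only P
	elapsed := time.Since(start)

	stop.Store(true) // reached only if the spinner was descheduled
	<-done
	return elapsed
}

func main() {
	fmt.Println("caller resumed after", resumeAfterSpin(50*time.Millisecond))
}
```

With async preemption active the caller should resume shortly after its sleep expires. Running the same program with GODEBUG=asyncpreemptoff=1 is expected to leave the caller starved, since nothing forces the spinner off the P.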

Before Go 1.14: cooperative scheduling

Pre-1.14, goroutines yielded only at well-defined points: calls to runtime.Gosched(), compiler-inserted preemption checks in function prologues, and blocking operations such as channel operations and I/O. A tight loop with no function calls could hold its OS thread indefinitely, which was one of the motivations for async preemption.

Async preemption widens the race window

Under cooperative scheduling the instant-of-yield is always a well-known program point (function prologue, explicit call). Under async preemption the instant-of-yield can be almost any instruction boundary in user code; the runtime declines to preempt only at points it recognises as unsafe. As a result, any codegen sequence that momentarily violates a scheduler-observable invariant creates a reachable race window.

Canonical wiki instance: sources/2025-10-08-cloudflare-we-found-a-bug-in-gos-arm64-compiler. Go's arm64 compiler emitted a function epilogue that adjusted the stack pointer with two separate ADD instructions when the frame size did not fit arm64's 12-bit ADD immediate. Cooperative scheduling would never yield between them, because those instructions are part of the generated epilogue rather than a yield point. Async preemption could, and did, land exactly in that one-instruction window, where SP had been only partially restored.
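The arithmetic behind the two-instruction epilogue can be sketched as follows. This is an illustrative model only: splitAddImm is a hypothetical helper, not compiler code, and the frame size and SP value are made-up numbers. It assumes the A64 ADD (immediate) form, which carries a 12-bit value optionally shifted left by 12, so an offset above 4095 has to be split into a low part and a high part.

```go
package main

import "fmt"

// imm12Mask covers the 12-bit immediate of the A64 ADD (immediate) form.
const imm12Mask = 1<<12 - 1

// splitAddImm splits off into a low part (imm12) and a high part (imm12<<12).
// ok is false when off needs more than 24 bits, which two ADDs cannot cover.
func splitAddImm(off uint32) (lo, hi uint32, ok bool) {
	if off >= 1<<24 {
		return 0, 0, false
	}
	return off & imm12Mask, off &^ imm12Mask, true
}

func main() {
	const frame = 16<<12 + 80 // hypothetical frame size: 65616 bytes
	sp := uint32(0x100000)    // hypothetical SP on entry to the epilogue

	lo, hi, _ := splitAddImm(frame)
	// After the first ADD, SP is neither the callee's SP nor the caller's SP.
	fmt.Printf("ADD $%d, RSP, RSP       -> SP = %#x (partially adjusted)\n", lo, sp+lo)
	fmt.Printf("ADD $(%d<<12), RSP, RSP -> SP = %#x (fully adjusted)\n", hi>>12, sp+lo+hi)
}
```

The value printed after the first line is the state a preemption signal can observe if it arrives between the two instructions.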

Interaction with stack unwinding

Preemption handlers walk the goroutine stack via (*unwinder).next. The unwinder dereferences sp to locate parent frames. If preemption lands at an instruction where sp has only been partially adjusted, the unwinder reads invalid frames and crashes. See concepts/stack-unwinding.
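Why a partially adjusted sp is fatal can be shown with a toy unwinder. This is a deliberately simplified sketch, not the runtime's (*unwinder).next: the toyStack type, the addresses, and the frame sizes are all hypothetical. It assumes each frame keeps its return address at 0(sp) and that the caller's frame begins at sp plus the frame size looked up from the PC.

```go
package main

import "fmt"

// toyStack is a hypothetical model: mem holds stack words by address, and
// frameSize maps a PC to the frame size of the function containing it.
type toyStack struct {
	mem       map[uint64]uint64
	frameSize map[uint64]uint64
}

// unwind walks from (pc, sp) towards the root frame, collecting PCs.
func (s toyStack) unwind(pc, sp uint64) ([]uint64, error) {
	var pcs []uint64
	for pc != 0 { // a zero return address marks the root in this model
		size, ok := s.frameSize[pc]
		if !ok {
			return pcs, fmt.Errorf("unexpected return pc %#x", pc)
		}
		pcs = append(pcs, pc)
		pc = s.mem[sp] // saved return address at 0(sp)
		sp += size     // step to the caller's frame
	}
	return pcs, nil
}

const (
	mainPC, bigPC = 0x1000, 0x2000 // hypothetical code addresses
	mainSP        = 0x100000       // hypothetical SP of main's frame
	bigFrame      = 16<<12 + 80    // hypothetical large frame: 65616 bytes
	bigSP         = mainSP - bigFrame
)

// newToyStack builds a two-frame stack: main called big.
func newToyStack() toyStack {
	return toyStack{
		mem: map[uint64]uint64{
			mainSP:     0,          // root frame: no caller
			bigSP:      mainPC,     // big returns into main
			bigSP + 80: 0xdeadbeef, // an ordinary local, not a return address
		},
		frameSize: map[uint64]uint64{mainPC: 32, bigPC: bigFrame},
	}
}

func main() {
	s := newToyStack()
	fmt.Println(s.unwind(bigPC, bigSP))    // sp consistent with pc
	fmt.Println(s.unwind(bigPC, bigSP+80)) // sp after only the first ADD
}
```

In the second call the unwinder treats a local variable as a return address and fails on the next lookup, which mirrors in miniature how an inconsistent (pc, sp) pair derails the walk.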

Design implication for compiler backends

Any architecture-specific code generator that emits multiple instructions to update runtime-observable state (most importantly the stack pointer) must ensure that any intermediate state either never exists (the update is a single instruction) or is never read by the runtime during preemption. The fix described in the Cloudflare 2025-10 post applies this: the offset is first built in a scratch register, and SP is then adjusted by a single, indivisible register-form ADD. See patterns/preemption-safe-compiler-emit.
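The rule can be phrased as a check over instruction boundaries. The sketch below is a hypothetical model, not compiler code: the instr type, the pseudo-assembly strings, and the choice of R27 as the scratch register are assumptions made for illustration. It lists the SP value visible at each boundary of an epilogue and accepts the sequence only if every boundary sees either the initial or the final SP.

```go
package main

import "fmt"

// instr is one toy instruction and the amount it adds to SP.
type instr struct {
	text    string
	spDelta int64
}

// observableSPs returns SP at every instruction boundary of seq, including
// the boundary before the first instruction and after the last one.
func observableSPs(sp int64, seq []instr) []int64 {
	out := []int64{sp}
	for _, in := range seq {
		sp += in.spDelta
		out = append(out, sp)
	}
	return out
}

// preemptionSafe reports whether no boundary exposes an intermediate SP.
func preemptionSafe(sp int64, seq []instr) bool {
	states := observableSPs(sp, seq)
	final := states[len(states)-1]
	for _, s := range states {
		if s != sp && s != final {
			return false
		}
	}
	return true
}

const frame = 16<<12 + 80 // hypothetical frame size: 65616 bytes

// twoAddEpilogue models the split-immediate sequence.
func twoAddEpilogue() []instr {
	return []instr{
		{"ADD $80, RSP, RSP", 80},
		{"ADD $(16<<12), RSP, RSP", 16 << 12},
	}
}

// scratchEpilogue models building the offset in a scratch register first.
func scratchEpilogue() []instr {
	return []instr{
		{"MOVD $65616, R27", 0}, // writes only the scratch register
		{"ADD R27, RSP, RSP", frame},
	}
}

func main() {
	const sp = 0x100000 // hypothetical SP on entry to the epilogue
	fmt.Println("two ADDs safe:", preemptionSafe(sp, twoAddEpilogue()))
	fmt.Println("scratch register safe:", preemptionSafe(sp, scratchEpilogue()))
}
```

The scratch-register form still takes two instructions, but only the last one writes SP, so no boundary exposes a half-updated stack pointer.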

Seen in

sources/2025-10-08-cloudflare-we-found-a-bug-in-gos-arm64-compiler
