PATTERN Cited by 1 source
Preemption-safe compiler emit
Intent
When a language runtime supports async preemption at any instruction boundary and the runtime's observers (GC, scheduler, signal handler) read runtime-observable state (stack pointer, frame pointer, pointer barriers, ...), the compiler must emit code that never leaves that state partially updated at any instruction boundary.
Concretely: if updating the state requires multiple opcodes because of ISA encoding limits, build the full new value in a scratch register first, then apply it to the runtime-observable target with a single indivisible register-form opcode.
The anti-pattern
Emitting a split immediate operation directly to the target:
; arm64 stack pointer adjustment, pre-go1.23.12
ADD $8, RSP, R29
ADD $(16<<12), R29, R29
ADD $16, RSP, RSP
ADD $(16<<12), RSP, RSP ; race window starts after previous ADD
RET
Between the two RSP adjustments, RSP holds a value that is
neither the old nor the new stack pointer. Any
async preemption landing here
leaves the runtime unable to unwind
the stack.
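The split in the listing above follows from the arm64 ADD (immediate) encoding, which carries a 12-bit unsigned immediate, optionally shifted left by 12 bits. A minimal Go sketch of that arithmetic follows; the helper names fitsAddImm and splitAddImm are hypothetical and are not part of the Go toolchain.

```go
package main

import "fmt"

// fitsAddImm reports whether n can be encoded in a single arm64
// ADD (immediate): a 12-bit unsigned value, optionally shifted left by 12.
// Hypothetical helper for illustration only.
func fitsAddImm(n uint64) bool {
	return n < 1<<12 || (n&0xfff == 0 && n < 1<<24)
}

// splitAddImm splits n (assumed < 1<<24) into a low 12-bit part and a
// high part that is a multiple of 4096, one ADD immediate each.
func splitAddImm(n uint64) (lo, hi uint64) {
	return n & 0xfff, n &^ 0xfff
}

func main() {
	n := uint64(16 + 16<<12) // the adjustment from the listing above
	lo, hi := splitAddImm(n)
	// lo maps to ADD $16, hi>>12 maps to the 16 in ADD $(16<<12).
	fmt.Println(fitsAddImm(n), lo, hi>>12) // prints "false 16 16"
}
```

Each half fits one opcode on its own, which is why a naive lowering ends up writing RSP twice.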
The pattern
Build the full immediate in a scratch register, then apply with one indivisible opcode:
; arm64 stack pointer adjustment, go1.23.12+
LDP -8(RSP), (R29, R30)
MOVD $32, R27
MOVK $(1<<16), R27
ADD R27, RSP, RSP ; indivisible
RET
The MOVD + MOVK pair updates a scratch register (not
runtime-observable). The single ADD R27, RSP, RSP applies
the update to the observable target in one opcode. Preemption
may land before or after, but not during.
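The difference between the two listings can be checked with a toy model that records the RSP value visible at every instruction boundary. This is only a sketch under simplifying assumptions: registers are a plain map, the LDP is omitted because it does not write RSP, and all names (step, observedSP, torn, splitSeq, scratchSeq) are hypothetical.

```go
package main

import "fmt"

// step models one opcode as a mutation of the register file.
type step func(r map[string]uint64)

func addImm(dst string, imm uint64) step {
	return func(r map[string]uint64) { r[dst] += imm }
}
func addReg(dst, src string) step {
	return func(r map[string]uint64) { r[dst] += r[src] }
}
func mov(dst string, imm uint64) step {
	return func(r map[string]uint64) { r[dst] = imm }
}

// movk replaces one 16-bit field of dst and keeps the other bits.
func movk(dst string, imm16 uint64, shift uint) step {
	return func(r map[string]uint64) {
		r[dst] = (r[dst] &^ (uint64(0xffff) << shift)) | imm16<<shift
	}
}

// observedSP returns the RSP value visible after each opcode retires.
func observedSP(sp uint64, prog []step) []uint64 {
	regs := map[string]uint64{"RSP": sp}
	var seen []uint64
	for _, s := range prog {
		s(regs)
		seen = append(seen, regs["RSP"])
	}
	return seen
}

// torn reports whether any boundary exposes an RSP that is neither
// the old nor the new stack pointer.
func torn(seen []uint64, oldSP, newSP uint64) bool {
	for _, v := range seen {
		if v != oldSP && v != newSP {
			return true
		}
	}
	return false
}

// splitSeq mirrors the two RSP writes of the anti-pattern listing.
func splitSeq() []step {
	return []step{addImm("RSP", 16), addImm("RSP", 16<<12)}
}

// scratchSeq mirrors MOVD + MOVK + ADD from the pattern listing.
func scratchSeq() []step {
	return []step{mov("R27", 32), movk("R27", 1, 16), addReg("RSP", "R27")}
}

func main() {
	const sp = 0x40000000
	fmt.Println(torn(observedSP(sp, splitSeq()), sp, sp+16+16<<12))  // true
	fmt.Println(torn(observedSP(sp, scratchSeq()), sp, sp+32+1<<16)) // false
}
```

In the scratch-register sequence, every boundary shows either the old or the new RSP; in the split sequence, one boundary shows the old RSP plus 16.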
What counts as "runtime-observable"
The runtime's safe-point analysis defines the surface. In Go:
- Stack pointer (sp): read during stack unwinding for GC scan, panic, and traceback.
- Frame pointer / saved link register: read during traceback to identify the calling function.
- Heap pointers with write-barrier semantics: read by the GC's concurrent mark phase. The compiler's write-barrier sequence is required to be observable-atomic.
- The goroutine's g pointer register: used by the preemption handler to locate scheduler state.
Any compiler update to these must pass through a scratch register if the update cannot fit in a single opcode.
Why the assembler level is not sufficient
Before go1.23.12, the Go toolchain expressed the intent at the
obj.Prog IR level as a single logical ADD $n, RSP, RSP and
relied on the assembler (asm7.go's conclass) to split the
immediate when necessary. The IR-level abstraction was leaky:
downstream passes and runtime observers could not tell that
what looked like one operation was actually two opcodes.
The fix promotes preemption-safety awareness to the
compiler level. The compiler now emits code that is already
decomposed through a scratch register, so the assembler has
nothing to split. See systems/go-compiler (patch in
cmd/internal/obj/arm64/obj7.go).
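As a rough illustration of what deciding at the compiler level means, the sketch below picks the instruction sequence before any assembler-side splitting can happen. It is not the code in obj7.go: emitSPAdjust is a hypothetical helper that returns assembler-style text, and the use of R27 as the scratch register simply follows the listing above.

```go
package main

import "fmt"

// emitSPAdjust returns assembler-style text for "RSP += n".
// Hypothetical helper for illustration; not the actual obj7.go logic.
func emitSPAdjust(n uint64) []string {
	// A single ADD (immediate) suffices when n fits in 12 bits,
	// optionally shifted left by 12.
	if n < 1<<12 || (n&0xfff == 0 && n < 1<<24) {
		return []string{fmt.Sprintf("ADD $%d, RSP, RSP", n)}
	}
	return []string{
		// Materialise n in the scratch register. This may expand to
		// several opcodes, but none of them touches RSP.
		fmt.Sprintf("MOVD $%d, R27", n),
		// The only write to the observable register.
		"ADD R27, RSP, RSP",
	}
}

func main() {
	for _, n := range []uint64{16, 32 + 1<<16} {
		fmt.Println(emitSPAdjust(n))
	}
}
```

Whatever the constant, RSP is written by exactly one opcode, so there is nothing left for a later stage to split.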
Generalisation
Any compiler targeting a runtime with async preemption must audit codegen for all cases where:
- The target register is runtime-observable at preemption time.
- The operation requires multiple opcodes due to ISA encoding limits, register pressure, or other reasons.
If both hold, use a scratch register + indivisible apply. If either condition fails, no special handling is needed: a split write to a register the runtime never observes is safe, and an operation that fits in one opcode has no intermediate state to expose.
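The two audit conditions can be expressed as a small check over emitted code. The sketch below is an assumption-laden toy, not a real compiler pass: the inst type, the observable set, and the grouping of opcodes into logical updates are all hypothetical, and the LDP from the pattern listing is modelled as two entries because it writes two registers.

```go
package main

import "fmt"

// observable is an illustrative set of registers the runtime may read
// at a preemption point (assumed for this sketch).
var observable = map[string]bool{"RSP": true, "R29": true, "R30": true}

// inst is one emitted opcode: its destination register and the logical
// update it belongs to.
type inst struct {
	dst    string
	update int
}

// audit returns, in program order, each observable register that is
// written by more than one opcode within the same logical update.
func audit(prog []inst) []string {
	type key struct {
		update int
		dst    string
	}
	writes := map[key]int{}
	var flagged []string
	for _, in := range prog {
		if !observable[in.dst] {
			continue // condition 1 fails: split writes are harmless
		}
		k := key{in.update, in.dst}
		writes[k]++
		if writes[k] == 2 { // condition 2: more than one opcode
			flagged = append(flagged, in.dst)
		}
	}
	return flagged
}

// antiPattern and fixed model the destination registers of the two
// listings earlier on this page.
var antiPattern = []inst{{"R29", 0}, {"R29", 0}, {"RSP", 1}, {"RSP", 1}}
var fixed = []inst{{"R29", 0}, {"R30", 0}, {"R27", 1}, {"R27", 1}, {"RSP", 1}}

func main() {
	fmt.Println(audit(antiPattern), audit(fixed)) // prints "[R29 RSP] []"
}
```

Under these assumptions the scratch register R27 may be written as often as needed, while any observable register written twice in one update is reported.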
Seen in
- sources/2025-10-08-cloudflare-we-found-a-bug-in-gos-arm64-compiler — canonical wiki instance. Go's arm64 backend pre-go1.23.12 emitted the anti-pattern for function-epilogue SP adjustments on frames > 4 KiB. Fix in go1.23.12 / go1.24.6 / go1.25.0 applies the pattern.
Related
- systems/go-compiler: where the fix landed (cmd/internal/obj/arm64/obj7.go).
- systems/go-assembler: previously did the immediate splitting; no longer required for this case.
- systems/arm64-isa — the architectural constraint.
- concepts/split-instruction-race-window — the failure class this pattern prevents.
- concepts/async-preemption-go — the runtime mechanism the pattern must cooperate with.
- patterns/upstream-the-fix — the meta-pattern; the preemption-safe emit fix is ideally upstreamed into the toolchain rather than worked around in user code.