PATTERN
Callback-slice VM in Go¶
Problem¶
You need to build a bytecode-VM-class fast interpreter for a dynamic expression language in Go. The mainstream designs (big-switch VM, tail-call continuation interpreter) don't translate well:
- A big-switch VM in Go is often compiled with binary-search dispatch instead of a jump table, and there is no reliable way to force a jump table. Register spilling in the giant switch function hurts further.
- Tail-call continuation loops depend on guaranteed tail-call optimization (LLVM `musttail`); Go's compiler doesn't guarantee tail calls, so the stack grows.
- A JIT is not worth the complexity if instruction dispatch is under ~20% of runtime.
Solution¶
Emit each instruction as a Go closure pushed onto a `[]func(*VirtualMachine) int` slice. The VM loop walks the slice, invoking each callback in turn; each callback returns an offset to advance the instruction pointer.
The pattern has two ingredients.
1. The VM is trivial¶
```go
func (vm *VirtualMachine) execute(p *Program) (eval, error) {
	code := p.code
	ip := 0
	for ip < len(code) {
		ip += code[ip](vm)
		if vm.err != nil {
			return nil, vm.err
		}
	}
	if vm.sp == 0 {
		return nil, nil
	}
	return vm.stack[vm.sp-1], nil
}
```
One for-loop, one indirect call per opcode, one error check. That's it. No switch, no opcode decode, no case explosion.
2. The compiler emits closures, not bytecode¶
```go
func (c *compiler) emitPushNull() {
	c.emit(func(vm *VirtualMachine) int {
		vm.stack[vm.sp] = nil
		vm.sp++
		return 1
	})
}

func (c *compiler) emitPushColumn_text(offset int, col collations.TypedCollation) {
	c.emit(func(vm *VirtualMachine) int {
		vm.stack[vm.sp] = newEvalText(vm.row[offset].Raw(), col)
		vm.sp++
		return 1
	})
}
```
Instruction arguments (`offset`, `col`) are captured in closure state by the Go compiler. No encoding, no decoding, no bytecode format to keep in sync with the VM.
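The two ingredients combine into a small runnable sketch. This is not Vitess's actual code: the `eval`, `Program`, and `compiler` types here are minimal stand-ins, and `emitPushInt64`/`emitAddInt64` are hypothetical opcodes invented for illustration.

```go
package main

import "fmt"

// eval is a stand-in for a real engine's richer value interface.
type eval = any

type VirtualMachine struct {
	stack []eval
	sp    int
	err   error
}

type Program struct {
	code []func(*VirtualMachine) int
}

// The VM loop from the pattern: one indirect call per opcode.
func (vm *VirtualMachine) execute(p *Program) (eval, error) {
	code := p.code
	ip := 0
	for ip < len(code) {
		ip += code[ip](vm)
		if vm.err != nil {
			return nil, vm.err
		}
	}
	if vm.sp == 0 {
		return nil, nil
	}
	return vm.stack[vm.sp-1], nil
}

// The compiler side: emit appends a closure to the program slice.
type compiler struct {
	p Program
}

func (c *compiler) emit(f func(*VirtualMachine) int) {
	c.p.code = append(c.p.code, f)
}

// emitPushInt64 captures its argument in closure state — no
// operand encoding, no decode step in the VM.
func (c *compiler) emitPushInt64(v int64) {
	c.emit(func(vm *VirtualMachine) int {
		vm.stack[vm.sp] = v
		vm.sp++
		return 1
	})
}

func (c *compiler) emitAddInt64() {
	c.emit(func(vm *VirtualMachine) int {
		a := vm.stack[vm.sp-2].(int64)
		b := vm.stack[vm.sp-1].(int64)
		vm.stack[vm.sp-2] = a + b
		vm.sp--
		return 1
	})
}

func main() {
	var c compiler
	c.emitPushInt64(2)
	c.emitPushInt64(3)
	c.emitAddInt64()

	vm := &VirtualMachine{stack: make([]eval, 8)}
	result, err := vm.execute(&c.p)
	fmt.Println(result, err) // 5 <nil>
}
```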
Control flow¶
Each callback returns an int offset:
- `return 1` — advance to the next instruction (sequential).
- `return N` — jump forward by N (forward branch).
- `return -N` — jump backward by N (loop).
- `return 0` with an error sentinel — halt or deoptimise.
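A backward jump (`return -N`) is enough to express a loop. The following is a hedged, self-contained sketch — the opcodes and stack layout are invented for illustration, not taken from Vitess — that sums 3 + 2 + 1 by jumping back two instructions while the counter is positive:

```go
package main

import "fmt"

// A stripped-down VM: just an int64 stack, no error slot.
type VirtualMachine struct {
	stack []int64
	sp    int
}

func (vm *VirtualMachine) execute(code []func(*VirtualMachine) int) int64 {
	ip := 0
	for ip < len(code) {
		ip += code[ip](vm) // the returned offset drives control flow
	}
	return vm.stack[vm.sp-1]
}

func main() {
	// Hypothetical program. Stack layout: [sum, i].
	code := []func(*VirtualMachine) int{
		// init: sum = 0, i = 3
		func(vm *VirtualMachine) int { vm.stack[vm.sp] = 0; vm.sp++; return 1 },
		func(vm *VirtualMachine) int { vm.stack[vm.sp] = 3; vm.sp++; return 1 },
		// loop body (index 2): sum += i
		func(vm *VirtualMachine) int { vm.stack[vm.sp-2] += vm.stack[vm.sp-1]; return 1 },
		// i--
		func(vm *VirtualMachine) int { vm.stack[vm.sp-1]--; return 1 },
		// while i > 0, jump back 2 instructions to the loop body
		func(vm *VirtualMachine) int {
			if vm.stack[vm.sp-1] > 0 {
				return -2
			}
			return 1
		},
		// pop i, leaving sum on top
		func(vm *VirtualMachine) int { vm.sp--; return 1 },
	}
	vm := &VirtualMachine{stack: make([]int64, 8)}
	fmt.Println(vm.execute(code)) // 6
}
```

Because Go closures capture variables by reference, a compiler can also emit a jump closure over a not-yet-known offset and backpatch it once the target instruction index is resolved.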
Properties¶
| Property | Value |
|---|---|
| Dispatch cost | One indirect call per opcode |
| Runtime memory | Zero on most opcodes (see static specialization) |
| Compile-time memory | One closure allocation per instruction |
| VM ↔ compiler sync | None — there's no bytecode encoding |
| Instruction argument encoding | Free (closure capture) |
| Control flow | Integer offsets returned by callbacks |
| Maintenance cost | Low; each opcode is a self-contained function |
Canonical example: Vitess evalengine¶
Vicent Martí's 2025 Vitess evalengine rewrite is the canonical wiki instance. The VM is at `go/vt/vtgate/evalengine/vm.go`; the entire execution engine is "hardly more complicated than this" (Source: sources/2025-04-05-planetscale-faster-interpreters-in-go-catching-up-with-cpp).
Composed with:
- patterns/static-type-specialized-bytecode — every closure is a type-specialised opcode emitted by Vitess's semantic analyzer based on schema types.
- patterns/vm-ast-dual-interpreter-fallback — when a specialized closure hits a value-dependent type promotion, it sets `vm.err = errDeoptimize` and execution falls back to the AST interpreter.
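The deoptimization handoff can be sketched as follows. This is an assumption-laden illustration, not Vitess's actual code: the `errDeoptimize` sentinel, the `evaluate` wrapper, and the stub AST fallback are all hypothetical shapes for the mechanism the pattern describes.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel; the real evalengine's shape may differ.
var errDeoptimize = errors.New("deoptimize")

type VirtualMachine struct {
	stack []any
	sp    int
	err   error
}

func (vm *VirtualMachine) execute(code []func(*VirtualMachine) int) (any, error) {
	ip := 0
	for ip < len(code) {
		ip += code[ip](vm)
		if vm.err != nil {
			return nil, vm.err
		}
	}
	return vm.stack[vm.sp-1], nil
}

// evaluate runs the specialized VM first and falls back to a
// (stubbed) AST interpreter when a closure signals deoptimization.
func evaluate(code []func(*VirtualMachine) int, astFallback func() any) (any, error) {
	vm := &VirtualMachine{stack: make([]any, 8)}
	result, err := vm.execute(code)
	if errors.Is(err, errDeoptimize) {
		return astFallback(), nil
	}
	return result, err
}

func main() {
	// A specialized int closure that bails out on an unexpected type:
	// it sets the sentinel and returns 0, halting the VM loop.
	code := []func(*VirtualMachine) int{
		func(vm *VirtualMachine) int {
			v := any("not an int") // value-dependent surprise at runtime
			if _, ok := v.(int64); !ok {
				vm.err = errDeoptimize
				return 0
			}
			vm.stack[vm.sp] = v
			vm.sp++
			return 1
		},
	}
	result, _ := evaluate(code, func() any { return "handled by AST interpreter" })
	fmt.Println(result) // handled by AST interpreter
}
```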
Benchmark result: VM geomean −48.60% sec/op vs the original AST interpreter; faster than MySQL C++ on 4 of 5 benchmarks; zero memory allocations on 4 of 5 benchmarks.
When to use¶
- You're building a performance-critical interpreter in Go. C/C++/Rust have better compiler support for the alternatives.
- Instructions are high-level (each opcode does substantial work). If opcodes are trivial (native `ADD`), a JIT becomes worthwhile.
- The source language has strong static type information, so you can compose with static specialization and avoid runtime type dispatch.
- Program size is moderate. Each instruction has a closure allocation; very large programs may stress compile-time memory.
When not to use¶
- Languages where types can only be observed at runtime. A dynamic-typing-heavy VM benefits more from quickening or JIT type speculation than from static specialization.
- Non-Go languages. C/C++/Rust should use tail-call continuation interpreters (musttail) for lower dispatch cost.
- VMs with extremely hot, trivial opcodes. JIT wins when dispatch is >~30% of runtime.
Caveats¶
- Closure-allocation count scales with program size. Each instruction is a closure object; a 10k-instruction query plan is 10k closures. Acceptable when compile-time cost is amortised across many executions; expensive for one-shot queries.
- Not debuggable as bytecode. You can't dump the program to a `.pyc`-like byte stream. Debugging means reading Go source + flamegraphs.
- Ties implementation to Go. Porting a callback-slice VM to C would require a manual closure layout; to Python, different calling conventions. This is a Go-specific sweet spot.
Seen in¶
- sources/2025-04-05-planetscale-faster-interpreters-in-go-catching-up-with-cpp — canonical wiki instance. Vitess evalengine VM. Geomean −48.60% sec/op vs AST baseline, catches up with MySQL's C++ implementation on 4/5 benchmarks. First wiki instance of a production bytecode-less Go VM.
Related¶
- concepts/callback-slice-interpreter
- concepts/bytecode-virtual-machine
- concepts/tail-call-continuation-interpreter
- concepts/go-compiler-optimization-gap
- concepts/static-type-specialization
- concepts/instruction-dispatch-cost
- patterns/static-type-specialized-bytecode
- patterns/vm-ast-dual-interpreter-fallback
- systems/vitess-evalengine