PATTERN Cited by 1 source

Callback-slice VM in Go

Problem

You need to build a bytecode-VM-class fast interpreter for a dynamic expression language in Go. The mainstream designs don't translate well: a big-switch VM wants computed goto for cheap threaded dispatch, which Go doesn't have, and a tail-call continuation interpreter wants guaranteed tail calls (musttail), which the Go compiler doesn't provide.

Solution

Emit each instruction as a Go closure pushed onto a []func(*VirtualMachine) int slice. The VM loop walks the slice, invoking each callback in turn; each callback returns an offset to advance the instruction pointer.

The pattern has two ingredients.

1. The VM is trivial

func (vm *VirtualMachine) execute(p *Program) (eval, error) {
    code := p.code
    ip := 0
    for ip < len(code) {
        ip += code[ip](vm)
        if vm.err != nil {
            return nil, vm.err
        }
    }
    if vm.sp == 0 {
        return nil, nil
    }
    return vm.stack[vm.sp-1], nil
}

One for-loop, one indirect call per opcode, one error check. That's it. No switch, no opcode decode, no case explosion.

2. The compiler emits closures, not bytecode

func (c *compiler) emitPushNull() {
    c.emit(func(vm *VirtualMachine) int {
        vm.stack[vm.sp] = nil
        vm.sp++
        return 1
    })
}

func (c *compiler) emitPushColumn_text(offset int, col collations.TypedCollation) {
    c.emit(func(vm *VirtualMachine) int {
        vm.stack[vm.sp] = newEvalText(vm.row[offset].Raw(), col)
        vm.sp++
        return 1
    })
}

Instruction arguments (offset, col) are captured in closure state by the Go compiler. No encoding, no decoding, no bytecode format to keep in sync with the VM.

Control flow

Each callback returns an int offset:

  • return 1 — advance to the next instruction (sequential).
  • return N — jump forward by N (forward branch).
  • return -N — jump backward by N (loop).
  • return 0 with an error sentinel — halt or deoptimise.

Properties

  • Dispatch cost: one indirect call per opcode
  • Runtime memory: zero allocations on most opcodes (see static specialization)
  • Compile-time memory: one closure allocation per instruction
  • VM ↔ compiler sync: none; there is no bytecode encoding to keep aligned
  • Instruction argument encoding: free (closure capture)
  • Control flow: integer offsets returned by callbacks
  • Maintenance cost: low; each opcode is a self-contained function

Canonical example: Vitess evalengine

Vicent Martí's 2025 Vitess evalengine rewrite is the canonical wiki instance. The VM is at go/vt/vtgate/evalengine/vm.go; the entire execution engine is "hardly more complicated than this" (Source: sources/2025-04-05-planetscale-faster-interpreters-in-go-catching-up-with-cpp).

Composed with: static specialization (see the properties and "When to use" below).

Benchmark result: VM geomean −48.60% sec/op vs the original AST interpreter; faster than MySQL C++ on 4 of 5 benchmarks; zero memory allocations on 4 of 5 benchmarks.

When to use

  • You're building a performance-critical interpreter in Go. C/C++/Rust have better compiler support for the alternatives.
  • Instructions are high-level (each opcode does substantial work). If opcodes are trivial (native ADD), JIT becomes worthwhile.
  • The source language has strong static type information so you can compose with static specialization and avoid runtime type dispatch.
  • Program size is moderate. Each instruction has a closure allocation; very large programs may stress compile-time memory.

When not to use

  • Languages where types can only be observed at runtime. A dynamic-typing-heavy VM benefits more from quickening or JIT type speculation than from static specialization.
  • Non-Go languages. C/C++/Rust should use tail-call continuation interpreters (musttail) for lower dispatch cost.
  • VMs with extremely hot, trivial opcodes. JIT wins when dispatch is >~30% of runtime.

Caveats

  • Closure-allocation count scales with program size. Each instruction is a closure object; a 10k-instruction query plan is 10k closures. Acceptable when compile-time cost is amortised across many executions; expensive for one-shot queries.
  • Not debuggable as bytecode. You can't dump the program to a .pyc-like byte stream. Debugging means reading Go source + flamegraphs.
  • Ties implementation to Go. Porting a callback-slice VM to C would require a manual closure layout; to Python, different calling conventions. This is a Go-specific sweet spot.
