CONCEPT Cited by 1 source
Instruction dispatch cost
Instruction dispatch cost is the per-instruction overhead a bytecode or VM interpreter pays to fetch the next opcode, decide which handler to run, and jump to it — before executing any useful work for that instruction.
Dispatch cost is the main overhead that JIT compilation eliminates. It is also a main reason one interpreter design runs faster than another.
What makes up dispatch cost
For a classic big-switch VM:
```c
while (ip < len) {
    switch (code[ip].opcode) {  // ← dispatch: fetch + branch
    case OP_ADD:
        ...                     // ← actual work
        break;
    }
    ip++;
}
```
The dispatch step includes:
- Opcode fetch: load from instruction stream into register.
- Bounds check + loop overhead on the `while`.
- Switch branch: indirect jump (jump table) or cascaded compare-and-branch (binary search).
- Return to dispatch loop: after the case executes, control flows back through the `break` + loop header for the next iteration.
- Branch predictor pressure. The switch's indirect jump tends to cluster on one target per workload ("sticky branch"), which helps, but type-mixed workloads can thrash the BTB.
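To make the steps concrete, here is a minimal, self-contained version of the big-switch loop above. The opcode set (`OP_PUSH`, `OP_ADD`, `OP_MUL`, `OP_HALT`), the `struct insn` layout, and the function name `run_switch_vm` are assumptions made for illustration; they are not taken from any particular VM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_MAX 64

/* Hypothetical opcode set, for illustration only. */
enum opcode { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

struct insn {
    enum opcode opcode;
    int64_t     operand;   /* used by OP_PUSH only */
};

/* Runs `code` and returns the top of the stack at OP_HALT or at the
 * end of the program (0 if the stack is empty). */
int64_t run_switch_vm(const struct insn *code, size_t len)
{
    int64_t stack[STACK_MAX];
    size_t  sp = 0;
    size_t  ip = 0;

    while (ip < len) {                      /* bounds check + loop overhead */
        const struct insn *in = &code[ip];  /* opcode fetch */
        switch (in->opcode) {               /* dispatch branch */
        case OP_PUSH:
            assert(sp < STACK_MAX);
            stack[sp++] = in->operand;
            break;
        case OP_ADD:
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1]; /* actual work */
            sp--;
            break;
        case OP_MUL:
            assert(sp >= 2);
            stack[sp - 2] *= stack[sp - 1]; /* actual work */
            sp--;
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        }
        ip++;                               /* back to the loop header */
    }
    return sp > 0 ? stack[sp - 1] : 0;
}
```

In this sketch the work inside `OP_ADD` is a single addition, so the fetch, bounds check, and branch around it are likely to cost more than the opcode body itself.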
How dispatch cost varies by design
| Design | Dispatch cost | Notes |
|---|---|---|
| AST interpreter | Very high | Recursive function call per node; register spillage; type dispatch inside each node |
| Big-switch VM (C/C++) | Low | Jump table from compiler; usually near-optimal |
| Big-switch VM (Go) | Medium-high | Switch often compiles to a binary search over case values; see concepts/jump-table-vs-binary-search-dispatch |
| Tail-call interpreter | Very low | `musttail` makes dispatch a single indirect jump; CPython 3.14 reports up to ~30% improvement on some benchmarks |
| Callback-slice interpreter (Go) | Low-medium | One indirect call per opcode; no switch; closure captures state |
| JIT native code | ~zero | Straight-line machine code with no dispatch |
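As a rough C analogue of the callback-slice row, the sketch below replaces the `switch` with a table of function pointers, so each opcode costs one indirect call. All names here (`tbl_vm`, `tbl_insn`, `tbl_handlers`, `run_table_vm`) and the three-opcode instruction set are hypothetical. A Go version would use closures in a slice rather than a static table, so this shows only the shape of the dispatch, not that design's exact cost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, for illustration only. */
enum { TBL_PUSH, TBL_ADD, TBL_HALT, TBL_NUM_OPS };

struct tbl_insn { uint8_t opcode; int64_t operand; };

struct tbl_vm {
    int64_t stack[64];
    size_t  sp;
    int     halted;
};

typedef void (*tbl_handler)(struct tbl_vm *vm, const struct tbl_insn *in);

static void tbl_push(struct tbl_vm *vm, const struct tbl_insn *in)
{
    assert(vm->sp < 64);
    vm->stack[vm->sp++] = in->operand;
}

static void tbl_add(struct tbl_vm *vm, const struct tbl_insn *in)
{
    (void)in;
    assert(vm->sp >= 2);
    vm->stack[vm->sp - 2] += vm->stack[vm->sp - 1];
    vm->sp--;
}

static void tbl_halt(struct tbl_vm *vm, const struct tbl_insn *in)
{
    (void)in;
    vm->halted = 1;
}

/* The handler table stands in for the switch statement. */
static const tbl_handler tbl_handlers[TBL_NUM_OPS] = {
    [TBL_PUSH] = tbl_push,
    [TBL_ADD]  = tbl_add,
    [TBL_HALT] = tbl_halt,
};

int64_t run_table_vm(const struct tbl_insn *code, size_t len)
{
    struct tbl_vm vm = { .sp = 0, .halted = 0 };

    for (size_t ip = 0; ip < len && !vm.halted; ip++) {
        assert(code[ip].opcode < TBL_NUM_OPS);
        /* Dispatch: one table load plus one indirect call per opcode. */
        tbl_handlers[code[ip].opcode](&vm, &code[ip]);
    }
    return vm.sp > 0 ? vm.stack[vm.sp - 1] : 0;
}
```

A tail-call interpreter goes one step further: each handler ends by jumping directly to the next handler instead of returning to a central loop.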
The dispatch-overhead-share threshold
The PlanetScale post (listed under "Seen in") states a rule of thumb:
"JIT compilers are important for programming languages where their bytecode operations can be optimized into a very low level of abstraction (e.g. where an 'add' operator only has to perform a native x64 ADD). In these cases, the overhead of dispatching instructions becomes so dominant that replacing the VM's loop with a block of JITted code makes a significant performance difference. However, for SQL expressions, and even after our specialization pass, most of the operations remain extremely high level … The overhead of instruction dispatch, as measured in our benchmarks, is less than 20%."
Decision rule:
- Dispatch share >30% of runtime → JIT is justified. The VM can never catch up to native code while dispatch dominates.
- Dispatch share <20% → stay in the VM. JIT adds substantial engineering cost (code generation, relocation, invalidation, security surface, multi-arch) that won't be repaid.
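One way to see why the share matters is an Amdahl's-law style bound: if dispatch accounts for a fraction `d` of runtime, removing it entirely can speed the program up by at most `1 / (1 - d)`. This framing is an assumption added here, not something stated in the post, and the function name is hypothetical.

```c
#include <assert.h>

/* Upper bound on whole-program speedup if dispatch overhead were removed
 * entirely and nothing else changed. `dispatch_share` is the fraction of
 * runtime spent in dispatch, in the range [0, 1). */
double max_speedup_without_dispatch(double dispatch_share)
{
    assert(dispatch_share >= 0.0 && dispatch_share < 1.0);
    return 1.0 / (1.0 - dispatch_share);
}
```

Under this bound a 20% dispatch share caps the gain at 1.25x, and a 30% share at roughly 1.43x. A real JIT can also speed up the opcode bodies themselves, so the bound covers only the dispatch-elimination part of its benefit.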
How to measure dispatch cost
- Stub out the handler of a hot opcode so that it does no work (returns immediately) and time the run. What remains is approximately dispatch cost; the difference from the normal run is the opcode's useful work.
- Use `perf stat` counters (`branches`, `branch-misses`, `iTLB-load-misses`) to characterise how well the dispatch loop plays with the CPU frontend.
- Compare the median number of native instructions the interpreter executes per bytecode instruction against the size of the opcode body alone; the gap is dispatch.
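The first technique can be sketched as a micro-benchmark that times a program made only of no-op instructions and compares it with a program of working instructions. Everything here is hypothetical: the two opcodes `M_NOP` and `M_INC`, and the helpers `run_counting` and `ns_per_insn`. A two-case `switch` will usually compile to a compare rather than a jump table, so the absolute numbers are not representative of a real VM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical opcodes: M_NOP does nothing, M_INC does a little work. */
enum { M_NOP, M_INC };

/* Executes the program and returns how many instructions were dispatched.
 * The volatile read keeps the compiler from optimising the fetch away. */
size_t run_counting(const volatile uint8_t *code, size_t len, uint64_t *acc)
{
    size_t dispatched = 0;

    for (size_t ip = 0; ip < len; ip++) {
        switch (code[ip]) {
        case M_NOP:
            break;              /* no work: only dispatch is paid */
        case M_INC:
            (*acc)++;           /* stand-in for real opcode work */
            break;
        default:
            return dispatched;  /* unknown opcode: stop */
        }
        dispatched++;
    }
    return dispatched;
}

/* Rough CPU nanoseconds per dispatched instruction for a program that
 * consists of `len` copies of opcode `op`, run `repeats` times. */
double ns_per_insn(uint8_t op, size_t len, int repeats)
{
    assert(len > 0 && repeats > 0);
    uint8_t *code = malloc(len);
    assert(code != NULL);
    for (size_t i = 0; i < len; i++)
        code[i] = op;

    uint64_t acc = 0;
    size_t total = 0;
    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        total += run_counting(code, len, &acc);
    clock_t end = clock();

    free(code);
    assert(total == len * (size_t)repeats);
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)total;
}
```

Comparing `ns_per_insn(M_NOP, ...)` with `ns_per_insn(M_INC, ...)` gives a crude estimate of how much of each instruction's time is dispatch. `clock()` has coarse resolution, so large `len` and `repeats` values are needed for a stable reading.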
Consequences for language design
- Coarser opcodes amortise dispatch cost. If an opcode does substantial work (e.g. "match a JSON path", "format a decimal"), dispatch is a small tax. If an opcode does trivial work (e.g. "add two 32-bit ints"), dispatch dominates and JIT becomes the only way forward.
- Stack VMs vs register VMs. Register VMs typically execute fewer bytecode instructions for the same program (a 2x–5x reduction, depending on the program) because each opcode names its operands directly instead of shuffling them through a stack, so each dispatch is amortised over more work. Several JIT-targeting VMs (Dalvik, LuaJIT) use register-based bytecode partly for this reason.
Seen in
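A toy comparison illustrates the instruction-count difference. The statement `v0 = v1 + v2` is encoded once for a stack VM and once for a register VM, and each interpreter returns the number of dispatches it performed. Both instruction sets and all names (`s_insn`, `r_insn`, `run_stack`, `run_register`) are invented for this sketch, and a single addition is close to the best case for the register encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack bytecode: operands travel through an implicit stack. */
enum { S_LOAD, S_ADD, S_STORE };
struct s_insn { uint8_t op; uint8_t slot; };

/* Hypothetical register bytecode: each instruction names its operands. */
enum { R_ADD };
struct r_insn { uint8_t op; uint8_t dst, lhs, rhs; };

/* Runs stack bytecode over `vars`; returns the number of dispatches. */
size_t run_stack(const struct s_insn *code, size_t len, int64_t *vars)
{
    int64_t stack[16];
    size_t  sp = 0;
    size_t  dispatches = 0;

    for (size_t ip = 0; ip < len; ip++) {
        dispatches++;
        switch (code[ip].op) {
        case S_LOAD:
            assert(sp < 16);
            stack[sp++] = vars[code[ip].slot];
            break;
        case S_ADD:
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case S_STORE:
            assert(sp >= 1);
            vars[code[ip].slot] = stack[--sp];
            break;
        }
    }
    return dispatches;
}

/* Runs register bytecode over `regs`; returns the number of dispatches. */
size_t run_register(const struct r_insn *code, size_t len, int64_t *regs)
{
    size_t dispatches = 0;

    for (size_t ip = 0; ip < len; ip++) {
        dispatches++;
        switch (code[ip].op) {
        case R_ADD:
            regs[code[ip].dst] = regs[code[ip].lhs] + regs[code[ip].rhs];
            break;
        }
    }
    return dispatches;
}
```

With the stack encoding `LOAD 1, LOAD 2, ADD, STORE 0`, the statement takes four dispatches; with the register encoding `ADD 0, 1, 2`, it takes one. Real programs contain control flow and calls that both encodings handle similarly, so the overall ratio is smaller than in this example.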
- sources/2025-04-05-planetscale-faster-interpreters-in-go-catching-up-with-cpp — canonical benchmark of dispatch cost as a share of VM runtime. Vitess's measurement of <20% dispatch share drives the explicit rejection of JIT for SQL expression evaluation.