
Deterministic + model autofixer

Pattern

For post-stream code repairs where the fix requires both an objective invariant check and a judgment call about where and how to emit the fix, combine:

  • AST-based deterministic detection — parse the generated code, mechanically check invariants, produce a boolean "needs fix" decision and structured context.
  • Small fine-tuned model for placement / synthesis — a fast model (trained on real generations of the same failure class) that decides where the fix goes and how it's shaped.

Run only when needed, under a tight latency budget.
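Concretely, the control flow can be sketched as a two-stage pipeline. This is a minimal sketch in Python (a stand-in for the real TypeScript tooling); `Detector`, `Synthesizer`, and `autofix` are hypothetical names, not Vercel's:

```python
from typing import Callable, Optional

# Stage 1: deterministic detector -- parses the artifact and returns
# structured context when an invariant is violated, None otherwise.
Detector = Callable[[str], Optional[dict]]

# Stage 2: probabilistic synthesizer -- stands in for the small
# fine-tuned model that decides where the fix goes and how it's shaped.
Synthesizer = Callable[[str, dict], str]

def autofix(source: str, detect: Detector, synthesize: Synthesizer) -> str:
    context = detect(source)       # cheap, reproducible, always runs
    if context is None:
        return source              # most requests take this path: no model call
    return synthesize(source, context)  # model invoked only when needed
```

The detector runs on every request; the synthesizer only on the rare violation, which is what keeps model invocations bounded.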

Canonical Vercel framing

"Sometimes, there are issues that our system prompt and LLM Suspense cannot fix. These often involve changes across multiple files or require analyzing the abstract syntax tree (AST). For these cases, we collect errors after streaming and pass them through our autofixers. These include deterministic fixes and a small, fast, fine-tuned model trained on data from a large volume of real generations."

(Source: sources/2026-01-08-vercel-how-we-made-v0-an-effective-coding-agent)

The three disclosed examples

1. QueryClientProvider wrapping (hybrid — AST + model)

  • useQuery / useMutation from @tanstack/react-query require a QueryClientProvider ancestor.
  • AST parse to verify whether they're wrapped — objective yes/no.
  • Autofix model decides where to add the provider if missing — the right answer depends on the app's render-tree shape, which requires judgment.
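The "objective yes/no" half of this check is a tree walk: does every hook user have a provider ancestor? A toy illustration over a simplified render tree (Python stand-in for the real JSX AST; `Component` and `unwrapped_query_users` are invented names):

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    """Toy render-tree node; uses_query marks a useQuery/useMutation call."""
    name: str
    uses_query: bool = False
    children: list["Component"] = field(default_factory=list)

def unwrapped_query_users(node: Component, wrapped: bool = False) -> list[str]:
    """Collect components that use react-query hooks without a
    QueryClientProvider ancestor -- a deterministic yes/no, no model needed."""
    wrapped = wrapped or node.name == "QueryClientProvider"
    hits = [node.name] if node.uses_query and not wrapped else []
    for child in node.children:
        hits += unwrapped_query_users(child, wrapped)
    return hits
```

An empty result means no fix is needed; a non-empty one is the structured context handed to the placement model, which then decides where in the real tree to insert the provider.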

2. package.json completion (fully deterministic)

  • Scan the generated code's import statements to enumerate used modules.
  • Diff against the package.json dependency list.
  • Deterministically add missing entries (and their versions per a known registry).
  • No model call required.

3. JSX / TypeScript syntax repair (partially hybrid)

  • Deterministic TS / JSX parse detects syntax errors.
  • For the errors that have a single unambiguous fix (e.g. missing closing bracket), apply the fix deterministically.
  • For errors requiring judgment (which import path to complete, which type annotation), invoke the fine-tuned model.

Design principles

  1. Detection is deterministic; synthesis may be probabilistic. Parsing is cheap, accurate, and reproducible. Use it to decide whether a fix is needed before spending a model call. This keeps the model invocations rare and tightly bounded.

  2. Latency-gated behaviour. "These fixes run in under 250 milliseconds and only when needed, allowing us to maintain low latency while increasing reliability." Most requests skip the autofixer entirely; the latency cost is paid only on the problem cases.

  3. Train on real failures. "Trained on data from a large volume of real generations." The autofixer model isn't a general-purpose coder — it's a specialist trained on the empirical distribution of failures actually observed in production. A small specialist is faster and often more accurate on its narrow task than a larger general-purpose model.

  4. Cross-file scope is the autofixer's territory. Streaming rewrite ([[patterns/streaming-output-rewrite]]) can only see the current token; autofixers see the whole generated artifact and can reason across files. Use the right altitude for the problem.
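Principle 2 implies a deadline around the whole fix path. A rough sketch of a latency-gated wrapper (a real system would enforce the deadline with cancellation; this after-the-fact check, and the names `BUDGET_S` / `gated_autofix`, are illustrative assumptions):

```python
import time

BUDGET_S = 0.250  # the disclosed 250 ms ceiling

def gated_autofix(source: str, detect, fix) -> str:
    """Detection always runs; the fix runs only when needed and ships
    only if it lands within budget. On overrun, return the original --
    an unfixed artifact beats a slow one."""
    start = time.monotonic()
    context = detect(source)
    if context is None:
        return source             # no violation: zero added latency
    fixed = fix(source, context)
    if time.monotonic() - start > BUDGET_S:
        return source             # budget blown: fall back to original
    return fixed
```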

Latency profile

  • <250 ms total per fix (from the Vercel disclosure).
  • Runs only when the AST check detects an invariant violation; most requests don't trigger it.
  • Runs post-stream, after the model has finished emitting, so streaming latency is unaffected.

Trade-offs

  • Pipeline complexity. An AST parser + a fine-tuned model + deterministic rules is more moving parts than a single bigger model. Justified by the failure-mode specificity: each component is faster and more accurate on its narrow task.
  • Maintenance overhead. The fine-tuned model must be retrained as library conventions change; the AST parser must track language-version updates.
  • Judgment boundary. Drawing the line between deterministic-fix and model-fix requires calibration; when the line is wrong, you either over-invoke the model (cost/latency regression) or under-invoke (missed fix).

Contrast with streaming rewrite

| Property | Streaming rewrite | Post-stream autofixer |
| --- | --- | --- |
| When it runs | During generation | After stream completes |
| Scope | Per-token / local | Whole artifact, cross-file |
| Model calls | None (pure lookup) | Small fine-tuned model when needed |
| Latency budget | <100 ms per substitution, non-blocking | <250 ms, conditional |
| User sees broken state? | No | No (applied before render) |
| Best for | Import rewrites, name resolution, token compression | AST invariants, missing deps, cross-file fixes |

Seen in
