CONCEPT
LLM code-generation error rate¶
Definition¶
The LLM code-generation error rate is the fraction of code generations from a language model that fail to produce a working artifact (a compiling program, a rendering website, a passing test, or similar) when the model runs in isolation — without post-processing, autofixers, or wrapper pipelines.
Canonical Vercel disclosure¶
"In our experience, code generated by LLMs can have errors as often as 10% of the time."
(Source: sources/2026-01-08-vercel-how-we-made-v0-an-effective-coding-agent)
This is the baseline v0's composite pipeline is built against — and the first concrete numeric disclosure on this wiki of LLM code-generation reliability from a major production agentic-code system.
Why it matters for system design¶
A ~10% error rate means roughly one in ten user interactions lands on a broken artifact if the model is treated as the final stage of the pipeline. For a high-volume code-generation product, that is a production-blocking UX failure, not a latent tail problem. It is the specific number that motivates:
- patterns/composite-model-pipeline — the thesis that reliability is a pipeline problem, not a single-model problem.
- patterns/deterministic-plus-model-autofixer — the hybrid of AST-based deterministic checks (which detect invariant violations objectively) and small fine-tuned models (which judge where to emit fixes).
- patterns/streaming-output-rewrite — the stream-manipulation layer that catches failure modes during generation, so the user never sees the intermediate broken state.
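The pipeline thesis above can be sketched as a generate → check → repair loop. A minimal Python sketch, where the deterministic stage is a real AST check and the autofixer is a hypothetical stub standing in for a small fine-tuned model (the names `deterministic_check`, `autofix`, and `composite_pipeline` are illustrative, not from the Vercel post):

```python
import ast

def deterministic_check(source: str) -> list[str]:
    """Objective, AST-based invariant check (sketch): flag code that
    cannot possibly be a working artifact, here via syntax errors."""
    problems = []
    try:
        ast.parse(source)
    except SyntaxError as e:
        problems.append(f"syntax error: {e.msg} (line {e.lineno})")
    return problems

def autofix(source: str, problems: list[str]) -> str:
    """Placeholder for the model-based fixer; a real pipeline would call
    a small fine-tuned model here. This stub returns the input unchanged."""
    return source

def composite_pipeline(generate, max_retries: int = 2) -> tuple[str, bool]:
    """Treat reliability as a pipeline property: generate once, then
    alternate deterministic checks with repair attempts."""
    source = generate()
    for _ in range(max_retries):
        problems = deterministic_check(source)
        if not problems:
            return source, True
        source = autofix(source, problems)
    return source, not deterministic_check(source)
```

The key design point is the division of labor: the deterministic stage decides *whether* an invariant is violated (objective, cheap, no false authority), while the model stage decides *how* to repair it.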
Vercel's composite-pipeline improvement¶
Vercel discloses a double-digit percentage-point increase in success rate from their composite pipeline over the ~10% error baseline, but does not disclose the resulting absolute success rate. Note that a double-digit-point gain cannot literally sit on top of a 90% success rate without exceeding 100%, so either the measured baseline on their evaluation is below 90% or the gain is computed on a different metric — the precise post-pipeline number is deliberately withheld.
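As a back-of-envelope model (not from the Vercel post): if the base success rate is s and a repair stage fixes a fraction r of the remaining failures, the composite success rate compounds as s + (1 − s)·r. A one-line sketch:

```python
def composite_success(base_success: float, repair_rate: float) -> float:
    """Composite success when a repair stage catches a fraction
    repair_rate of the failures left by the base model."""
    return base_success + (1 - base_success) * repair_rate
```

On a 90% baseline, even a perfect repair stage (r = 1) adds only 10 points, which bounds how a "double-digit" gain can be read against that particular metric.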
Failure classes the ~10% covers¶
The Vercel post enumerates the failure classes the composite pipeline has to address (each is a subset of the ~10%):
- Outdated SDK API usage — model emits code for a frozen-at-training-cutoff version of a library (concepts/training-cutoff-dynamism-gap).
- Non-existent library symbols — e.g. icon hallucination against a churning namespace.
- Missing provider wrapping — e.g. `useQuery` without a `QueryClientProvider` ancestor.
- Missing package.json entries — imported modules not declared as dependencies.
- JSX / TypeScript syntax errors that slip past the streaming-rewrite layer.
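The missing-package.json-entries class reduces to a deterministic set-difference check: imports found in the source minus dependencies declared in the manifest. A sketch using Python's own `ast` module on Python-style imports, as a stand-in for the JS/TS tooling a system like v0 would actually need (`undeclared_imports` is an illustrative name, not Vercel's):

```python
import ast
import json

def undeclared_imports(source: str, package_json: str) -> set[str]:
    """Return top-level module names imported by `source` but absent
    from the manifest's "dependencies" map. Python-import analogue of
    the JS/TS check described in the text."""
    deps = set(json.loads(package_json).get("dependencies", {}))
    imported = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported.add(node.module.split(".")[0])
    return imported - deps
```

Because the check is a pure set difference over parsed structure, it never false-positives on style and needs no model judgment — only the repair (adding the entry, picking a version) does.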
Seen in¶
- sources/2026-01-08-vercel-how-we-made-v0-an-effective-coding-agent — first wiki disclosure; ~10% baseline; double-digit improvement from composite pipeline.
Related¶
- concepts/llm-hallucination — parent failure-mode category; most errors in the ~10% are hallucinations against library surfaces that have drifted since training.
- systems/vercel-v0 — production system built against this baseline.
- patterns/composite-model-pipeline — architectural response to the error rate.
- patterns/deterministic-plus-model-autofixer — one of the response mechanisms.