
Dynamic knowledge-injection prompt

Pattern

Rather than rely on web-search RAG (vulnerable to the telephone-game failure mode) or a frozen baseline system prompt (vulnerable to the training-cutoff dynamism gap), detect the intent of the incoming request and inject version-pinned, targeted knowledge directly into the system prompt — keeping the injection byte-stable within an intent class to preserve prompt-cache hits.

Canonical Vercel framing

"Instead of relying on web search, we detect AI-related intent using embeddings and keyword matching. When a message is tagged as AI-related and relevant to the AI SDK, we inject knowledge into the prompt describing the targeted version of the SDK. We keep this injection consistent to maximize prompt-cache hits and keep token usage low."

(Source: sources/2026-01-08-vercel-how-we-made-v0-an-effective-coding-agent)

The mechanism

  1. Intent detection via embeddings + keyword matching. The incoming user message is embedded and compared against pre-computed intent-class embeddings; keyword heuristics act as cheap priors. Output: one-of-N intent labels (e.g. AI-SDK-intent, frontend-framework-intent, integration-intent, generic-intent).

  2. Per-class knowledge pack. Each intent class has a hand-curated, version-pinned knowledge pack describing "the targeted version of the SDK" — API surface, common patterns, deprecated APIs to avoid, etc.

  3. System-prompt assembly. The assembled system prompt is [base prompt] + [intent-class knowledge pack] — the base prompt is byte-stable across all classes; the knowledge pack is byte-stable within a class. This maximises prompt-cache hit rate at the model provider (concepts/prompt-cache-consistency).

  4. Filesystem pointer. In addition to text knowledge, point the model at a read-only filesystem of curated code samples ([[patterns/read-only-curated-example-filesystem]]), letting the model search for concrete patterns on demand.
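Steps 2–3 can be sketched as follows. This is a minimal illustration, not Vercel's implementation: the pack contents, intent labels, and `BASE_PROMPT` text are all hypothetical stand-ins for their internal curation.

```typescript
// Hypothetical sketch of per-class knowledge packs + byte-stable assembly.
// BASE_PROMPT is shared across all intent classes; each pack is a frozen,
// version-pinned string, so the concatenation is byte-identical for every
// request in the same class — preserving prompt-cache hits.
const BASE_PROMPT = "You are a coding agent for Next.js applications.";

const KNOWLEDGE_PACKS: Record<string, string> = {
  "ai-sdk-intent":
    "## AI SDK (pinned version)\n" +
    "- Use streamText/generateText from the 'ai' package.\n" +
    "- Avoid deprecated APIs from earlier major versions.",
  "frontend-framework-intent":
    "## Next.js (pinned version)\n" +
    "- App Router by default; avoid legacy pages/ patterns.",
  "generic-intent": "", // no pack: fall back to parametric knowledge
};

function assembleSystemPrompt(intent: string): string {
  const pack = KNOWLEDGE_PACKS[intent] ?? "";
  // Stable base + stable separator + stable pack => stable cache prefix.
  return pack ? `${BASE_PROMPT}\n\n${pack}` : BASE_PROMPT;
}
```

Because each pack is a constant per class, two requests carrying the same intent label produce byte-identical system prompts, which is exactly what the provider-side prompt cache needs.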

Why prefer this over web-search RAG

Three compounding arguments from Vercel:

  1. No telephone game. A small summariser model in the RAG path can "hallucinate, misquote something, or omit important information." Direct injection skips this hop entirely.
  2. No stale results. Web-search indexes can return outdated blog posts and documentation even when the library has since shipped a new version.
  3. Prompt-cache stability. Direct injection is byte-deterministic within a class; web-search RAG produces different retrieved snippets per request, busting the cache.

When web-search is still useful

Web-search RAG remains appropriate for open-domain, fast-moving, unbounded knowledge (current events, market data, user-generated content). Vercel explicitly notes "v0 uses [web search] too" — the direct-injection preference applies to the specific class of library-API knowledge the model is expected to generate code against, where curation is feasible.

Intent-detection implementation notes

  • Embeddings + keywords both — not one or the other. Embeddings capture semantic similarity; keywords catch the long tail ("useQuery", "react-query", "AI SDK") where lexical match is stronger signal than semantic proximity. Ensemble is cheap.
  • Don't over-partition intent classes. More classes = more cache slots = lower hit rate per slot. Vercel's partitioning is coarse (AI SDK, frontend framework, integrations).
  • Intent detection itself is small-model-cheap. Embedding lookup + keyword scan is microseconds; this is not the bottleneck.
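The ensemble can be sketched as below, under loudly stated assumptions: the keyword table, intent labels, threshold, and centroid vectors are all hypothetical, and a real system would obtain `embedding` from an actual embedding model rather than toy 2-d vectors.

```typescript
// Hypothetical sketch: keyword priors first, embedding similarity second.
type Intent = "ai-sdk-intent" | "frontend-framework-intent" | "generic-intent";

// Lexical long-tail matches where keywords beat semantic proximity.
const KEYWORDS: Record<string, Intent> = {
  "ai sdk": "ai-sdk-intent",
  "streamtext": "ai-sdk-intent",
  "usequery": "frontend-framework-intent",
  "react-query": "frontend-framework-intent",
};

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function detectIntent(
  message: string,
  embedding: number[],                     // precomputed message embedding
  centroids: Record<Intent, number[]>,     // precomputed per-class centroids
  threshold = 0.8,
): Intent {
  // Keyword pass: cheap, and a stronger signal on the lexical long tail.
  const lower = message.toLowerCase();
  for (const [kw, intent] of Object.entries(KEYWORDS)) {
    if (lower.includes(kw)) return intent;
  }
  // Embedding pass: nearest class centroid, gated by a similarity threshold.
  let best: Intent = "generic-intent";
  let bestSim = threshold;
  for (const [intent, centroid] of Object.entries(centroids) as [Intent, number[]][]) {
    const sim = cosine(embedding, centroid);
    if (sim > bestSim) { bestSim = sim; best = intent; }
  }
  return best;
}
```

Note the ordering: a keyword hit short-circuits the embedding comparison, which is what makes terms like "useQuery" win even when the surrounding prose is semantically ambiguous.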

Trade-offs

  • Curation cost. Knowledge packs must be maintained — by the agent team, ideally with the library vendor (Vercel's v0 + AI SDK team co-maintain the pack + read-only example fs).
  • Coverage gap. Anything not on the curated list falls back to the model's parametric knowledge — so you want intent classes to map to the set of libraries/APIs that matter most for success rate.
  • Intent-detection failure. If the request is mis-classified, the wrong knowledge pack (or none at all) is injected. This is an acceptable failure mode, since the model falls back to the generic prompt, but it needs measurement.
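The measurement point can be made operational with a trivial counter. This is a hypothetical sketch, not Vercel's tooling: track the label distribution so that drift toward the generic class (a likely symptom of under-detection) is visible in monitoring.

```typescript
// Hypothetical monitoring hook: count intent labels as they are assigned.
const intentCounts = new Map<string, number>();

function recordIntent(intent: string): void {
  intentCounts.set(intent, (intentCounts.get(intent) ?? 0) + 1);
}

// Fraction of requests falling through to the generic prompt; a rising
// value suggests the intent classes no longer cover incoming traffic.
function genericFraction(): number {
  let total = 0;
  for (const n of intentCounts.values()) total += n;
  return (intentCounts.get("generic-intent") ?? 0) / Math.max(total, 1);
}
```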
