In-code annotation as LLM guidance¶
Pattern¶
When using an LLM for code transformation, write the guidance you would otherwise put in the system prompt as in-code comments at the call sites where it applies. A deterministic pre-pass (typically an AST codemod) is responsible for detecting the site and emitting the annotation.
Instead of prompts that say "when you see wrapper.find, consider getByRole if the element has an ARIA role, else getByTestId if there's a data-testid, else getByText for visible text", the AST pass writes:

```js
// TODO: Replace wrapper.find('SubmitButton') with RTL query.
// Suggested: screen.getByRole('button', { name: /submit/i })
// See https://docs/rtl/query-priority for priority order.
wrapper.find('SubmitButton').simulate('click');
```
directly next to the call site. The LLM, given this annotated file, converts each call using the site-specific guidance it can see inline.
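The pre-pass itself can be small. A minimal sketch, using a regex in place of a real AST visitor (a production codemod would use a tool such as jscodeshift) and a hypothetical suggestion table:

```javascript
// Hypothetical lookup table: what the deterministic pass "knows"
// about specific components. A real pass would derive this from
// the rendered DOM or the component source.
const SUGGESTIONS = {
  SubmitButton: "screen.getByRole('button', { name: /submit/i })",
};

// Annotate every line containing a wrapper.find('...') call with a
// TODO comment, plus a concrete suggestion when one is known.
// Regex detection is a simplification (one find per line, greedy match);
// an AST visitor would locate sites precisely.
function annotate(source) {
  return source.replace(
    /^([ \t]*)(.*wrapper\.find\('([^']+)'\).*)$/gm,
    (match, indent, line, component) => {
      const suggested = SUGGESTIONS[component];
      const comment = [
        `${indent}// TODO: Replace wrapper.find('${component}') with RTL query.`,
        suggested
          ? `${indent}// Suggested: ${suggested}`
          : `${indent}// See the RTL query-priority docs for the preferred query.`,
      ].join("\n");
      return `${comment}\n${indent}${line}`;
    }
  );
}
```

Running `annotate` over a test file yields the annotated intermediate artifact the LLM then converts.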
Forces¶
- System prompts are global; call sites are local. Guidance that has to fire at the right call site competes with all other guidance in the system prompt for the model's attention. Attaching it to the call site eliminates that competition.
- Prompt size isn't free. Dumping a big rulebook into every prompt burns tokens on rules that don't apply to this particular file.
- LLM attention is site-local. Models are much better at using context that sits near the token they're generating than at remembering rules from thousands of tokens ago in a long system prompt.
- Deterministic detection is cheap. An AST pass can precisely identify each call site; asking the LLM to find the sites itself relies on pattern-matching that may miss some or over-fire.
- Hallucination mitigation compounds. Every site the AST annotates is a site where the LLM is being shown the answer, not asked to guess at it.
Slack's framing¶
From the 2024-06 Enzyme→RTL retrospective:
"The second and arguably more effective approach we used to control the output of the LLM was the utilization of AST transformations. This method is rarely seen elsewhere in the industry. Instead of solely relying on prompt engineering, we integrated the partially converted code and suggestions generated by our initial AST-based codemod. The inclusion of AST-converted code in our requests yielded remarkable results. By automating the conversion of simpler cases and providing annotations for all other instances through comments in the converted file, we successfully minimized hallucinations and nonsensical conversions from the LLM. This technique played a pivotal role in our conversion process." (Source: sources/2024-06-19-slack-ai-powered-conversion-from-enzyme-to-react-testing-library)
Note the framing: annotations aren't just convenience for the LLM, they're primarily a hallucination-control mechanism.
Two flavours of annotation¶
- Handled case, shown-as-example: the AST pass fully converts the call and leaves it in place. The LLM sees a complete RTL call next to all the Enzyme calls it still needs to convert — inline examples of the target shape, styled exactly the way the codebase wants them.
- Unhandled case, flagged-with-suggestion: the AST pass leaves the Enzyme call in place but adds a preceding comment describing what the LLM should replace it with, which context to consult (DOM / React component source / doc URL), and any gotchas (e.g. "this one is asynchronous; use findBy*, not getBy*").
Both flavours are load-bearing — the first sets the target style, the second steers the LLM through specific traps.
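The two flavours can be seen as one emitter with two branches. A sketch, assuming a hypothetical site record (`handled`, `converted`, `suggestion`, `gotcha`, `original`) produced by the AST pass:

```javascript
// Emit output for one detected call site. The site-record shape is
// illustrative, not Slack's actual data model.
function emit(site) {
  if (site.handled) {
    // Flavour 1: fully converted in place; doubles as an inline
    // example of the target style for the LLM.
    return site.converted;
  }
  // Flavour 2: original call kept, preceded by a steering comment
  // with the suggestion and any site-specific gotcha.
  return [
    `// TODO: Replace with ${site.suggestion}`,
    site.gotcha ? `// Note: ${site.gotcha}` : null,
    site.original,
  ].filter(Boolean).join("\n");
}
```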
Contrast: system-prompt approach¶
System-prompt-only guidance for the same migration would look like a 20-item rulebook pasted at the top of every request. Slack's structured prompt does carry such guidance, but as a companion to the in-code annotations, not a replacement. The prompt handles universal rules ("preserve test count", "convert all Enzyme methods to RTL equivalents"); the annotations handle site-specific decisions ("this find should be getByRole('button') because the DOM shows a <button> element").
This split corresponds to the attention-split in how LLMs actually work — global instructions set policy, local context sets actions.
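The split can be sketched as a request builder; the rule list and message shape here are illustrative, not Slack's actual prompt:

```javascript
// Global policy: universal rules that apply to every file.
const GLOBAL_RULES = [
  "Preserve the test count.",
  "Convert all Enzyme methods to RTL equivalents.",
];

// Build one request: the system prompt carries global rules;
// site-specific guidance travels inside the annotated file itself.
function buildRequest(annotatedFile) {
  return {
    system: GLOBAL_RULES.map((rule, i) => `${i + 1}. ${rule}`).join("\n"),
    user: `Convert this file to React Testing Library:\n\n${annotatedFile}`,
  };
}
```

The system prompt stays constant across files, while the per-site decisions arrive exactly where the model will read them.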
Generalisation¶
Applies anywhere an LLM is performing site-by-site transformation under a large ruleset:
- Refactoring deprecated API usage with a type-checker writing annotations at each deprecated call.
- Language ports where the static type checker can emit type hints inline for the LLM to use when rewriting.
- Framework migrations where a lint rule annotates each violation with the preferred fix.
- Import-path rewrites where a module resolver annotates ambiguous imports with the correct target path.
The common shape: a deterministic tool knows something about the call site the LLM would have to guess; the tool writes what it knows inline, the LLM uses it.
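That common shape can be abstracted into one higher-order pass; `detect` and `lookup` are hypothetical stand-ins for whatever the deterministic tool knows, shown here with an import-path rewrite:

```javascript
// Generic annotation pass: for each line, `detect` finds a site and
// `lookup` supplies the fact the LLM would otherwise have to guess.
// Lines with a known fact get a preceding TODO comment.
function annotateSites(lines, detect, lookup) {
  return lines.flatMap((line) => {
    const site = detect(line);
    const fact = site && lookup(site);
    return fact ? [`// TODO: ${fact}`, line] : [line];
  });
}

// Illustrative use: a module resolver that knows the canonical path
// for an ambiguous import (module names here are made up).
const annotated = annotateSites(
  ["import { Button } from 'ui';"],
  (line) => (line.match(/from '([^']+)'/) || [])[1],
  (mod) => (mod === "ui" ? "resolve 'ui' to '@acme/ui-core'" : null)
);
```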
Consequences¶
Positive:
- Reduces hallucination at source (the LLM doesn't have to guess).
- Composes naturally with patterns/ast-plus-llm-hybrid-conversion — the AST pass is already walking the file.
- Improves incrementally: new AST rules add new annotations; LLM quality rises without any prompt changes.
- Produces a debuggable artifact — the annotated intermediate file is human-readable.
Negative:
- The AST pass has to be written and maintained. For one-off migrations this may not pay back.
- Annotations in the final output (if not stripped) are cruft; pipelines must remove them before merge.
- Bad annotations (wrong suggestion, stale doc link) actively mislead the LLM.
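The stripping step implied by the second negative can be a simple line filter; the marker prefixes follow the annotation format shown earlier, though a real pipeline would key on a dedicated tag rather than generic TODO text:

```javascript
// Comment prefixes the pipeline treats as migration annotations.
const MARKERS = [
  /^\s*\/\/ TODO: Replace /,
  /^\s*\/\/ Suggested: /,
  /^\s*\/\/ See https?:\/\//,
];

// Remove annotation lines before the converted file is merged,
// leaving ordinary comments and code untouched.
function stripAnnotations(source) {
  return source
    .split("\n")
    .filter((line) => !MARKERS.some((marker) => marker.test(line)))
    .join("\n");
}
```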
Related¶
- patterns/ast-plus-llm-hybrid-conversion — the umbrella pattern this primitive is a part of
- concepts/abstract-syntax-tree — the deterministic pre-pass that writes annotations
- concepts/llm-conversion-hallucination-control — the structural problem class
- concepts/llm-hallucination — the failure mode
- systems/enzyme-to-rtl-codemod — canonical production instantiation