PATTERN Cited by 1 source
Hyperlink allowlist validation on LLM output¶
Pattern shape¶
When an LLM generates a response that includes hyperlinks, validate every URL in the output against an allowlist extracted from the retrieved-context articles. Any URL not on the allowlist is stripped, rejected, or replaced. The allowlist is built per response (not globally) — it is the union of hyperlinks present in the specific articles that grounded that response.
The pattern targets LLM hyperlink hallucination — the failure mode where the LLM fabricates plausible-looking URLs that don't exist in the source corpus and don't resolve.
Three structural pieces¶
- Per-response allowlist extraction. When the RAG retriever returns the top-K articles for a query, extract every hyperlink (URL string, not anchor text) from the article bodies. This forms the per-response allowlist — a small set, typically dozens of URLs.
- LLM output URL parsing. Parse the LLM's generated text for URL-shaped tokens (markdown link [text](url), raw URL, HTML anchor, depending on the output format). Extract the URL strings.
- Validation + remediation. For each extracted URL, exact-match against the allowlist. Strip, reject, or replace any URL not on the allowlist. Yelp doesn't disclose the exact remediation policy; reasonable choices:
  - Strip: remove the link markup, keep the anchor text.
  - Replace: substitute with a known-valid fallback (e.g. the article URL of the top-1 retrieved article).
  - Reject: regenerate the response with stronger prompt constraints.
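The three pieces above can be sketched in a few lines of Python. This is a minimal illustration, not Yelp's implementation: it assumes both the articles and the LLM output use markdown links, it applies the strip policy, and the function names and `example.com` URLs are hypothetical.

```python
import re

# Matches markdown links of the form [anchor](url). Assumes URLs contain no
# whitespace or parentheses; raw URLs and HTML anchors would need extra patterns.
MD_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def build_allowlist(articles: list[str]) -> set[str]:
    """Piece 1: union of every link URL found in the retrieved article bodies."""
    return {url for body in articles for _, url in MD_LINK.findall(body)}


def validate_links(response: str, allowlist: set[str]) -> tuple[str, list[str]]:
    """Pieces 2 and 3: parse links, exact-match each URL, strip the misses."""
    rejected: list[str] = []

    def check(match: re.Match) -> str:
        anchor, url = match.group(1), match.group(2)
        if url in allowlist:
            return match.group(0)  # grounded link, leave it untouched
        rejected.append(url)
        return anchor  # strip policy: drop the markup, keep the anchor text

    return MD_LINK.sub(check, response), rejected


# Hypothetical retrieved articles and LLM response.
articles = [
    "Update hours in [Business Settings](https://example.com/settings/hours).",
    "See [billing help](https://example.com/help/billing) for invoices.",
]
response = (
    "Open [Business Settings](https://example.com/settings/hours), "
    "then read the [refund policy](https://example.com/help/refunds)."
)
cleaned, rejected = validate_links(response, build_allowlist(articles))
```

Returning the rejected URLs alongside the cleaned text leaves room for the other policies: a caller could regenerate when `rejected` is non-empty, or swap in a fallback URL instead of stripping.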
Canonical instance — Yelp CS Chatbot (2026-05-27)¶
Verbatim from the post:
"One of the most notable unexpected challenges was the tendency of Large Language Models (LLMs) to hallucinate hyperlinks frequently. Since our knowledge base articles contain numerous hyperlinks, and we intended for the LLM-generated responses to include accurate links, this required a dedicated solution. To counteract this, we developed a process to reliably retrieve valid hyperlinks from the source articles and integrated specific validation checks. This verification process ensures that any link included in the final response genuinely originates from one of the retrieved Support Center articles and is not invented by the LLM." (Source: sources/2026-05-27-yelp-beyond-the-menu-tree-how-yelp-built-a-smarter-customer-success-chatbot)
The pattern is one of three output-validation checks in Yelp's QA workflow gate: trust & safety, valid URL, character limit. Together they form the three-axis output gate that runs after LLM generation and before delivery to the user.
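A gate of this shape might be composed as below. The post names only the three checks; everything else here is an assumption for illustration, including the function names, the injected `safety_check` callable, the raw-URL regex, and the default character limit.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Rough raw-URL matcher for illustration; trailing punctuation is trimmed below.
URL_RE = re.compile(r"https?://[^\s)\]>\"']+")


@dataclass
class CheckResult:
    name: str
    passed: bool


def urls_ok(text: str, allowlist: set[str]) -> bool:
    """True when every URL in the text is an exact allowlist member."""
    urls = (u.rstrip(".,;:!?") for u in URL_RE.findall(text))
    return all(u in allowlist for u in urls)


def run_output_gate(
    text: str,
    allowlist: set[str],
    safety_check: Callable[[str], bool],  # hypothetical moderation hook
    max_chars: int = 1000,                # hypothetical limit
) -> list[CheckResult]:
    """Run all three checks after generation and report each result."""
    return [
        CheckResult("trust_and_safety", safety_check(text)),
        CheckResult("valid_url", urls_ok(text, allowlist)),
        CheckResult("character_limit", len(text) <= max_chars),
    ]


allow = {"https://example.com/help/billing"}
permissive = lambda text: True  # stand-in for a real trust & safety check

good = run_output_gate("See https://example.com/help/billing for details.", allow, permissive)
bad = run_output_gate("See https://example.com/help/refunds now.", allow, permissive, max_chars=10)
deliver = all(result.passed for result in good)
```

Reporting every check rather than stopping at the first failure makes it easier to log which axis rejected a response.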
Why prompt-engineering alone doesn't work¶
Common naive mitigations and their failure modes:
- "Only use links from the provided context": the model ignores the instruction or only partially obeys it, and mix-and-match fabrication (a path from one source URL joined to a query string from another) persists.
- Few-shot URL examples in the prompt: the model overfits to the example URL structure and fabricates novel URLs that match the pattern.
- Stop-tokens around URLs: these break markdown link syntax and do not actually prevent fabrication.
The structural property that breaks prompt engineering: URL correctness is binary (resolves or doesn't), unlike factual claims where graded faithfulness is achievable through better prompting. Binary-correctness output classes require deterministic post-hoc validation — the same insight as patterns/embedding-based-name-resolution (Vercel v0, 2026-01-08) for icon-name hallucination.
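To make the binary-correctness point concrete, the sketch below compares three checks on a mix-and-match fabrication: a host-only check, a string-similarity threshold, and exact allowlist membership. The URLs and function names are hypothetical, and the 0.7 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

allowlist = {
    "https://example.com/help/billing?topic=invoices",
    "https://example.com/settings/hours?tab=weekly",
}
# Fabricated: path from the first URL, query string from the second.
fabricated = "https://example.com/help/billing?tab=weekly"


def host_allowed(url: str, allowed: set[str]) -> bool:
    """Loose check: accept any URL whose host appears in the allowlist."""
    return urlsplit(url).netloc in {urlsplit(u).netloc for u in allowed}


def best_similarity(url: str, allowed: set[str]) -> float:
    """Highest character-level similarity ratio against any allowlisted URL."""
    return max(SequenceMatcher(None, url, u).ratio() for u in allowed)


def exact_allowed(url: str, allowed: set[str]) -> bool:
    """Strict check: the URL string must be an allowlist member."""
    return url in allowed


passes_host_check = host_allowed(fabricated, allowlist)             # accepted
passes_similarity = best_similarity(fabricated, allowlist) > 0.7    # accepted
passes_exact_match = exact_allowed(fabricated, allowlist)           # rejected
```

Both loose checks admit the fabricated URL because it shares a host and a long prefix with real ones; only exact membership rejects it.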
Comparison with adjacent post-hoc-validation patterns¶
- vs patterns/embedding-based-name-resolution (Vercel v0, 2026-01-08) — both fix binary-correctness symbolic-space hallucinations via deterministic post-hoc rewrite. Vercel v0 uses embedding similarity to map a hallucinated symbol to the closest valid name (icons live in a meaningful semantic space). Yelp uses exact allowlist match because URL strings have no useful similarity space — a mistyped path segment is either right or wrong.
- vs patterns/streaming-output-rewrite — Vercel v0 does the embedding-based name resolution inside the token stream (LLM Suspense). Yelp's hyperlink validation appears to run after generation completes (it's part of the validation-gate post-LLM). Whether streaming variants are feasible is product-dependent.
- vs patterns/critic-tool-call-introspection-suite — Slack Spear's per-finding-introspection auditing is an upstream auditing primitive; hyperlink-allowlist is a downstream gate. Different placement, different scope.
When to apply¶
Use when:
- LLM generates text intended for direct user consumption that includes URLs.
- The set of valid URLs is knowable a priori — either a fixed corpus (knowledge-base URLs) or extractable from the retrieved RAG context.
- Hallucinated URLs would break user trust (404s, wrong destination) or leak sensitive data (LLM emitting an internal URL that should not be publicly linked).
Don't use when:
- The valid-URL set is unbounded (e.g. the LLM is genuinely expected to suggest external URLs based on user intent).
- The LLM is in an agentic flow where URL emissions are tool calls (the URL is consumed by a downstream tool, not by a human; tool-side validation is more appropriate).
Risks¶
- Allowlist scope drift. What counts as "from the source articles" — only the article-page URLs, or every hyperlink inside the article body? Yelp's verbatim "valid hyperlinks from the source articles" implies the latter, but precise extraction logic isn't disclosed.
- External-link policy. What about cases where the right answer does include an external URL (e.g. a partner site)? Need explicit allowlist policy for cross-domain links.
- Replacement-URL plausibility. If the strategy is replace hallucinated URL with the top-1 retrieved article URL, the user may receive a link to an article that doesn't quite match the anchor text. Strip-only is safer but less helpful.
- No quantitative residual rate disclosed. Yelp doesn't publish how often hallucinations leak past the allowlist validation (e.g. if the model emits a URL that happens to appear in the retrieved articles but doesn't actually answer the query — semantically wrong even if formally valid).
- Performance. Building the per-response allowlist, parsing URLs, and validating them is a non-trivial CPU cost on every response. For high-QPS chatbots, consider caching the extracted allowlist keyed by the set of retrieved articles.
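The caching idea in the performance note could look roughly like this. It is a sketch under stated assumptions: articles are markdown, they live in a hypothetical in-memory `ARTICLES` store keyed by article ID, and the cache key is the order-insensitive set of retrieved IDs.

```python
import re
from functools import lru_cache

MD_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")

# Hypothetical article store; a real system would read from the knowledge base.
ARTICLES = {
    "a1": "Update hours in [Business Settings](https://example.com/settings/hours).",
    "a2": "See [billing help](https://example.com/help/billing) for invoices.",
}


@lru_cache(maxsize=4096)
def _allowlist_for(article_ids: frozenset[str]) -> frozenset[str]:
    """Extract link URLs once per distinct set of retrieved articles."""
    return frozenset(
        url for aid in article_ids for url in MD_LINK.findall(ARTICLES[aid])
    )


def allowlist_for(retrieved_ids: list[str]) -> frozenset[str]:
    # frozenset makes the key hashable and independent of retrieval order.
    return _allowlist_for(frozenset(retrieved_ids))


first = allowlist_for(["a1", "a2"])
second = allowlist_for(["a2", "a1"])  # same article set, served from the cache
stats = _allowlist_for.cache_info()
```

A cache like this has to be cleared (for example with `_allowlist_for.cache_clear()`) whenever an article is edited; otherwise it can keep admitting links that no longer appear in the source.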
Composes with¶
- concepts/retrieval-augmented-generation — provides the retrieved-context articles from which the allowlist is extracted.
- patterns/whole-article-retrieval-via-metadata-segments — whole-article retrieval makes the per-response allowlist trivially the union of in-article links. Chunk retrieval fragments the allowlist.
- concepts/llm-hyperlink-hallucination — the failure mode this pattern targets.
Seen in¶
- sources/2026-05-27-yelp-beyond-the-menu-tree-how-yelp-built-a-smarter-customer-success-chatbot — canonical: per-response hyperlink allowlist extraction from retrieved RAG context, validation gate before user delivery.
Related¶
- concepts/llm-hyperlink-hallucination — the failure mode.
- concepts/llm-hallucination — parent failure-mode concept.
- concepts/retrieval-augmented-generation — the architectural setting.
- patterns/embedding-based-name-resolution — sibling binary-correctness symbolic-space hallucination fix-up.
- patterns/streaming-output-rewrite — sibling token-stream rewrite layer.
- systems/yelp-cs-chatbot — canonical wiki instance.