PATTERN Cited by 1 source

Specialized workflow router with LLM intent detection¶

Pattern shape¶

An LLM classifies inbound user requests into one of N pre-defined workflows; each workflow has its own specialised handler independently designed to do the work. The handlers need not all use the LLM — they can be deterministic UIs, form-flow guides, templated canned responses, RAG-driven LLM generation, or any mix. Workflow buckets are picked along two axes: frequency of inbound requests AND risk class (legal, financial, churn, safety). High-risk request shapes get deterministic / templated handlers that bypass LLM generation entirely.
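
The two-axis bucketing can be made concrete with a small decision function. This is a minimal sketch, not the method from the source: `RequestShape`, `choose_handler`, the handler-kind strings, and the `MIN_SHARE_FOR_OWN_WORKFLOW` threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"
    STANDARD = "standard"
    HIGH = "high"  # legal, financial, churn, or safety exposure


@dataclass(frozen=True)
class RequestShape:
    name: str
    share_of_traffic: float  # fraction of inbound requests, 0.0 to 1.0
    risk: Risk
    answer_is_lookup: bool   # True when the answer is account data, not prose


# Hypothetical cut-off: shapes rarer than this stay in the default workflow.
MIN_SHARE_FOR_OWN_WORKFLOW = 0.05


def choose_handler(shape: RequestShape) -> str:
    """Map a request shape to a handler kind along the two axes."""
    # Axis 1, risk: high-risk shapes never reach free-form LLM generation,
    # regardless of how often they occur.
    if shape.risk is Risk.HIGH:
        return "templated"
    # Axis 2, frequency: rare shapes do not justify a dedicated handler.
    if shape.share_of_traffic < MIN_SHARE_FOR_OWN_WORKFLOW:
        return "default_qa"
    # Frequent, lower-risk shapes get a handler fitted to the work.
    return "deterministic_ui" if shape.answer_is_lookup else "rag_llm"
```

Checking risk before frequency is the design choice that matters here: a rare but high-risk shape still gets a templated handler instead of falling through to the generative default.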

Structure¶

┌────────────────────────┐
│   User query           │
└──────────┬─────────────┘
           │
           ▼
┌────────────────────────┐
│   LLM intent router    │ ← classifies into 1 of N workflows
└──────────┬─────────────┘
           │
   ┌───────┴───────┬───────────┬───────────┬───────────┐
   ▼               ▼           ▼           ▼           ▼
┌──────┐    ┌──────────┐  ┌──────────┐ ┌─────────┐ ┌─────────┐
│ QA   │    │ Billing  │  │ Refund   │ │ Cancel  │ │ Review  │
│ RAG  │    │ Det. UI  │  │ Form     │ │ Template│ │ Template│
│ LLM  │    │ no-LLM   │  │ guide    │ │ no-LLM  │ │ no-LLM  │
└──────┘    └──────────┘  └──────────┘ └─────────┘ └─────────┘
   │             │             │           │            │
   │             │             │           │            │
   │             │             ▼           ▼            ▼
   │             │       ┌────────────────────────────────┐
   │             └──────▶│  Output to user                │
   │                     │  (Billing / Form / Templated)  │
   │                     └────────────────────────────────┘
   ▼
┌──────────────────────────┐
│ Validation gate          │ ← LLM-generated content only
│ (T&S / URL / char-limit) │
└────────────┬─────────────┘
             ▼
┌──────────────────────────┐
│ Output to user           │
└──────────────────────────┘
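
The flow in the diagram can be sketched as a single dispatch function. This is an illustrative sketch under stated assumptions: `Handler`, `route`, `keyword_classifier`, and the placeholder replies are hypothetical, and the keyword classifier only stands in for the LLM intent router so the example runs without a model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Handler:
    run: Callable[[str], str]  # takes the user query, returns a reply
    uses_llm: bool             # True only if the reply is LLM-generated


def route(
    query: str,
    classify: Callable[[str], str],
    handlers: dict[str, Handler],
    gate: Callable[[str], bool],
    fallback_reply: str,
) -> str:
    """Classify, dispatch to exactly one handler, and gate only LLM output."""
    workflow = classify(query)
    handler = handlers[workflow]
    reply = handler.run(query)
    # Deterministic and templated replies skip the gate entirely.
    if handler.uses_llm and not gate(reply):
        return fallback_reply
    return reply


# Stand-in for the LLM router (hypothetical; a real router would call a model).
def keyword_classifier(query: str) -> str:
    q = query.lower()
    if "cancel" in q:
        return "cancel"
    if "bill" in q or "balance" in q:
        return "billing"
    return "qa"


# Placeholder handlers; the reply strings are illustrative only.
HANDLERS = {
    "qa": Handler(lambda q: f"(generated answer about: {q})", uses_llm=True),
    "billing": Handler(lambda q: "[billing panel]", uses_llm=False),
    "cancel": Handler(lambda q: "[cancellation template]", uses_llm=False),
}
```

A call such as `route("show my balance", keyword_classifier, HANDLERS, gate=lambda r: len(r) <= 500, fallback_reply="[handoff to support]")` returns the billing placeholder without ever invoking the gate, which mirrors the two output paths in the diagram.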

Canonical instance — Yelp CS Chatbot (2026-05-27)¶

Yelp's LLM-Assisted Customer Success Chatbot routes inbound queries into five named workflows:

Workflow	Handler shape	Risk class
Question/Answering (QA)	RAG-driven LLM generation	Standard
Billing	Deterministic UI (subscribed services + balances)	Low
Refund	Form-submission guide	Standard
Cancel	Templated response (no LLM generation)	High financial / legal
Review	Templated response (no LLM generation)	High financial / legal

Only ONE of the five workflows (QA) has the LLM actually generate free-form text. Verbatim from the post:

"We bucketed the workflows based on the frequency of inbound requests along with the potential risks of the queries (e.g. churn risk, legal risk, and financial risk). When a user submits a query, the system uses the LLM to intelligently detect which workflow the query should follow." (Source: sources/2026-05-27-yelp-beyond-the-menu-tree-how-yelp-built-a-smarter-customer-success-chatbot)

In an A/B test against the legacy menu-tree, fixed-phrase chatbot, the new system doubled the resolution rate.

Three structural pieces¶

LLM intent classifier — short prompt asking the LLM to pick one of N workflows for a given user query. Prompt includes workflow names, brief descriptions, and ideally a default fallback.
Workflow-specific handlers — one per workflow. Independently designed; some are deterministic, some template-driven, some RAG-driven. Handlers don't share prompts, error handling, or output format constraints.
Workflow-specific output gate — applies only when the handler itself uses LLM generation. Templated / deterministic handlers skip the gate. Yelp's QA gate is three-axis: trust & safety / valid URL / character limit.
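
The first and third pieces lend themselves to a short sketch. Everything below is an assumption made for illustration: the workflow descriptions, the prompt wording, the `help.example.com` allow-list, the character limit, and the blocked-term list (a toy stand-in for a real trust & safety check) are hypothetical, and treating "valid URL" as a host allow-list is one possible reading, not a detail confirmed by the source.

```python
import re
from urllib.parse import urlparse

# Workflow names follow the Yelp example; descriptions are illustrative.
WORKFLOWS = {
    "qa": "General product questions.",
    "billing": "Questions about subscribed services or balances.",
    "refund": "Requests to get money back.",
    "cancel": "Requests to stop a service.",
    "review": "Questions or complaints about reviews.",
}
DEFAULT_WORKFLOW = "qa"


def build_router_prompt(query: str) -> str:
    """Short classification prompt: names, descriptions, default fallback."""
    options = "\n".join(f"- {name}: {desc}" for name, desc in WORKFLOWS.items())
    return (
        "Pick the single workflow that best fits the user query.\n"
        f"{options}\n"
        f"If none clearly fits, answer {DEFAULT_WORKFLOW}.\n"
        "Answer with the workflow name only.\n"
        f"User query: {query}"
    )


def parse_router_output(raw: str) -> str:
    """Normalise the model's answer; anything unrecognised takes the default."""
    label = raw.strip().strip(".\"'").lower()
    return label if label in WORKFLOWS else DEFAULT_WORKFLOW


ALLOWED_HOSTS = {"help.example.com"}      # hypothetical allow-list
MAX_CHARS = 600                           # hypothetical limit
BLOCKED_TERMS = {"guarantee", "lawsuit"}  # toy stand-in for a T&S check
URL_RE = re.compile(r"https?://\S+")


def passes_gate(text: str) -> bool:
    """Three-axis gate for LLM-generated replies only."""
    if len(text) > MAX_CHARS:
        return False
    lowered = text.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return False
    for url in URL_RE.findall(text):
        # Trim trailing sentence punctuation before parsing the host.
        host = urlparse(url.rstrip(".,)")).hostname
        if host not in ALLOWED_HOSTS:
            return False
    return True
```

Parsing the router output against a closed set of labels means a chatty or malformed model answer degrades to the default workflow instead of raising an error in the dispatch step.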

When to apply¶

Use this pattern when:

Inbound queries span distinct request shapes with different cost / risk / SLA properties.
Some request shapes have legal / financial / churn risk that makes LLM generation unsafe; bypassing LLM generation for those is desirable.
Some request shapes are deterministic (Billing details, account balances) — LLM-generated answers there add risk without value.
The LLM's classification capability is more reliable than its generation capability for the domain.

Don't use this pattern when:

Queries are uniformly free-form and can't be cleanly bucketed.
Handler diversity adds operational overhead that exceeds the per-workflow specialisation benefit.
The cost of mis-routing a query is higher than the cost of monolithic LLM generation (rare, but possible in adversarial domains).

Trade-offs¶

Operational complexity ↑ — N handlers to maintain. Mitigated by handler independence (changes to one don't regress others).
Latency ↑ — every query pays an LLM-classification cost before the handler runs. Mitigated by using a small fast model (e.g. GPT-4o-mini) for the router and reserving larger models for QA generation.
Risk surface area ↓↓ — high-risk workflows entirely bypass LLM generation. The blast radius of an LLM hallucination is bounded to the QA workflow only.
Engineering velocity per handler ↑ — handler teams own their slice; can iterate independently.
User experience — uniform front (chat interface) over heterogeneous back (handlers). User doesn't know the request was routed.

Risks¶

Mis-routing. Router classifies Cancel intent as QA; QA-RAG generates an LLM response that should have been a templated risk-mitigation message. Mitigation: conservative router prompts; bias toward routing into safer/templated workflows when uncertain.
Coverage gaps. N workflows can't cover everything; the default fallback (typically QA) absorbs the residual. Mitigation: monitor router output distribution; add workflows when out-of-distribution queries cluster.
Workflow ambiguity. Queries that span workflows ("I want to cancel and get a refund") need explicit disambiguation policy.
Router accuracy regression. Re-prompting the router (or upgrading the underlying model) can shift the workflow distribution. Track per-workflow rate as an SLI.
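
Two of the mitigations above can be sketched in a few lines: biasing toward safer workflows when the router is unsure, and tracking the per-workflow rate as an SLI. The sketch assumes the router can return a score per workflow, which the source does not state; `SAFETY_RANK`, `CONFIDENCE_FLOOR`, `PLAUSIBLE_MARGIN`, and the drift tolerance are hypothetical values.

```python
from collections import Counter

# Hypothetical ordering: lower rank means a safer handler (no free-form text).
SAFETY_RANK = {"cancel": 0, "review": 0, "billing": 1, "refund": 1, "qa": 2}
CONFIDENCE_FLOOR = 0.7  # hypothetical: below this the router counts as unsure
PLAUSIBLE_MARGIN = 0.2  # hypothetical: how close to the top score is "plausible"


def conservative_pick(scores: dict[str, float]) -> str:
    """Take the top workflow when confident, else the safest plausible one."""
    top = max(scores, key=scores.get)
    if scores[top] >= CONFIDENCE_FLOOR:
        return top
    plausible = [w for w, s in scores.items() if s >= scores[top] - PLAUSIBLE_MARGIN]
    # Safest rank first; break ties by the higher score.
    return min(plausible, key=lambda w: (SAFETY_RANK[w], -scores[w]))


def workflow_rates(labels: list[str]) -> dict[str, float]:
    """Share of routed queries per workflow over some window."""
    counts = Counter(labels)
    total = len(labels)
    return {w: counts[w] / total for w in counts}


def drifted(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.05,
) -> list[str]:
    """Workflows whose share moved by more than `tolerance` (absolute)."""
    names = set(baseline) | set(current)
    return sorted(
        w for w in names
        if abs(current.get(w, 0.0) - baseline.get(w, 0.0)) > tolerance
    )
```

Comparing `workflow_rates` before and after a prompt change or model upgrade with `drifted` gives a quick signal that the routing distribution shifted, before per-workflow quality metrics have time to move.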

Related¶

concepts/llm-workflow-router — the underlying concept.
patterns/intent-domain-decomposer-agentic-router — decomposer-router sibling (Databricks World Bank Group). Decomposes a query into multiple sub-queries routed to different domain agents; Yelp routes each query to exactly one handler.
patterns/multi-agent-supervisor-routing — supervisor routing in multi-turn agent flows (different time horizon).
concepts/retrieval-augmented-generation — the QA handler shape.
concepts/llm-hallucination — the failure mode that motivates removing LLM from high-risk workflows.
systems/yelp-cs-chatbot — canonical wiki instance.

Seen in¶

sources/2026-05-27-yelp-beyond-the-menu-tree-how-yelp-built-a-smarter-customer-success-chatbot — canonical: 5 workflows, two-axis bucketing, doubled resolution rate vs legacy menu-tree chatbot.