PATTERN

Static prompt chain over agent loop

Problem. You need to build an LLM-backed application for a domain where precision matters, the step structure is known in advance, and the downside of a wrong answer is non-trivial (operational on-call triage, compliance workflow, medical or legal draft). An agent loop with tool use and autonomous planning is possible, but its failure modes — tool selection errors, hallucinated steps, context rot, loop-escape bugs — are hard to bound. Your team has not yet built a strong enough evaluation + guardrail stack to absorb those failure modes.

Solution. Build a static prompt chain: hardcode the multi-step pipeline in code; the LLM is a reasoning component inside a deterministic workflow, not the orchestrator. Each step is its own LLM call with its own system prompt + user prompt template + output contract; the caller parses step N's reply and builds step N+1's prompt. No function calling, no MCP tool use, no short- or long-term memory, no RAG unless the retrieval is embedded as a deterministic pre-step.
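The caller-drives-everything mechanic can be sketched in a few lines. This is a minimal illustration, not STAR's implementation: `call_model` is a stub standing in for any LLM client, and the step prompts, roles, and JSON contract are made up for the example.

```python
import json

def call_model(system: str, user: str) -> str:
    """Stub for an LLM call; a real chain would hit a model API here."""
    return json.dumps({"summary": "cpu saturation on service X", "confidence": 0.8})

def run_chain(telemetry: dict) -> dict:
    # Step 1: the caller, not the model, builds the prompt deterministically.
    user_1 = (f"Telemetry:\n{json.dumps(telemetry)}\n"
              "Reply as JSON with keys summary, confidence.")
    reply_1 = call_model("You are a metrics analyst.", user_1)

    # Parse at the step boundary: this is the chain's one failure point.
    analysis = json.loads(reply_1)

    # Step 2: the caller threads step 1's output into step 2's prompt.
    # No tool use, no memory -- each call is stateless.
    user_2 = (f"Given this analysis: {json.dumps(analysis)}\n"
              "Reply as JSON.")
    reply_2 = call_model("You are an incident responder.", user_2)
    return json.loads(reply_2)
```

Note that the LLM never sees a tool schema or a planning instruction; it only ever answers the one question the caller puts in front of it.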

Canonicalised by Expedia STAR (2026-04-28) for incident root-cause analysis. STAR's explicit rationale:

"While AI agents and chatbots are gaining traction, we aimed to start with something a) simple, b) precise (to a certain extent, considering the potential hallucinations of the models), and c) that avoids the additional and currently less understood failure modes of an agent." (Source: sources/2026-04-28-expedia-expedias-service-telemetry-analyzer)

The three dimensions

| Dimension | Agent loop | Static chain |
| --- | --- | --- |
| Ordering | LLM plans next step | Fixed in code |
| Tool use | Function calling / MCP at LLM's discretion | Tools (if any) called deterministically by the caller |
| Memory | Conversation + scratchpad evolves | Each call is stateless except for the chain's explicit context threading |
| Context envelope | Potentially unbounded | Statically computable (see concepts/token-heavy-system) |
| Failure modes | Tool-selection error, planning loop, hallucinated step, context rot | Parse error at a step boundary (bounded; single-step retry is the recovery) |
| Evaluation | Hard (many branches) | Per-step and end-to-end evals are both tractable |

Canonical Expedia STAR shape

(deterministic) collect telemetry   ──▶  step 1: no LLM
(LLM)   per-metric analysis         ──▶  step 2: N LLM calls in parallel
         │                               (role + format per metric class)
(LLM)   aggregated RCA              ──▶  step 3: single LLM call
         │                               (previous outputs as generated
         │                                knowledge; see
         │                                concepts/generated-knowledge-prompting)
(deterministic) return response     ──▶  step 4: no LLM

Ordering, step count, and per-step prompts are fixed. The LLM does not decide what to do next.
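The four-step shape above can be sketched as follows. This is a hedged illustration under assumed names: `analyze_metric` and `aggregate` stand in for the real per-metric and RCA LLM calls, and the metric inputs are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_metric(name: str, series: list[float]) -> str:
    """Step 2 stub: one LLM call per metric, each with its own role + format."""
    return f"{name}: peak {max(series)}"

def aggregate(findings: list[str]) -> str:
    """Step 3 stub: a single LLM call whose prompt embeds the per-metric
    outputs as generated knowledge."""
    return "RCA based on: " + "; ".join(findings)

def run_rca(telemetry: dict[str, list[float]]) -> str:
    # Step 1 (deterministic): telemetry was already collected by the caller.
    # Step 2: N parallel, independent calls -- the fan-out is fixed in code,
    # not planned by the model.
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(lambda kv: analyze_metric(*kv), telemetry.items()))
    # Step 3: one aggregation call; step 4 is just returning the result.
    return aggregate(findings)
```

Because step 2's calls are independent, they parallelise trivially, and because the fan-out is fixed, the total token envelope is knowable before any call is made.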

Why this beats "start with an agent"

  • Evaluation is tractable. Per-step evaluations are well-scoped; regressions are localised to the step.
  • Token budget is static — see concepts/token-heavy-system. The chain's envelope is computable before a single production call.
  • Failure modes are known. A parse error at a step boundary is a cleaner failure than an agent that loops forever or calls the wrong tool. The caller can retry the single step with a stricter reminder prompt.
  • You can graduate later. STAR's explicit roadmap adds MCP tool use, dependency-graph context, and conversational UI as future work, once the operational envelope has proven out and the evaluation stack has caught up.
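One way to implement the single-step retry: on a parse error, re-ask the same step with a stricter format reminder appended. The reminder text and the two-attempt limit here are illustrative choices, not taken from the STAR post.

```python
import json

REMINDER = "\nReply with ONLY a JSON object, no prose, no code fences."

def run_step(call_model, system: str, user: str, max_attempts: int = 2) -> dict:
    """Run one chain step, retrying with a stricter prompt on a parse error."""
    prompt = user
    for _ in range(max_attempts):
        reply = call_model(system, prompt)
        try:
            return json.loads(reply)  # parse at the step boundary
        except json.JSONDecodeError:
            prompt = user + REMINDER  # retry this step only, stricter prompt
    raise ValueError(f"step failed to produce valid JSON after {max_attempts} attempts")
```

Because each call is stateless, the retry replays exactly one call with the same inputs; nothing upstream of the failed step needs to be re-run.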

When this is the wrong choice

  • The problem shape is genuinely exploratory. An RCA that requires N tool hops whose order depends on intermediate findings doesn't fit a fixed chain. Agent-with-tools wins.
  • The model must pick between many possible actions. A chain doesn't help if the "what do I do next?" question is the entire problem.
  • The domain tolerates confident wrongness. If you can absorb agent failure modes, the flexibility upside is real.

Seen in

  • Expedia STAR (2026-04-28) — canonical wiki instance. STAR ships a production RCA service that is explicitly a static prompt chain rather than an agent, with the trade-off named in the post's design section. The five named use cases (incident investigation, post-incident RCA, troubleshooting runbooks, performance optimization, failure-injection evaluation) all run through the same fixed chain.