PATTERN Cited by 1 source
Structured output grammar for valid plans¶
Shape¶
When an LLM agent must produce a structured object with correctness constraints beyond well-formedness (e.g. a query plan, a scheduling decision, a config edit), express those constraints as a grammar passed to the model's structured-output decoder. Because the decoder can only emit objects the grammar admits, invalid objects are ruled out during generation rather than caught afterwards. This eliminates the failure mode where the agent produces syntactically valid but semantically invalid output.
This is distinct from, and stronger than, reliability-style JSON schema validation:
| Axis | Structured-output reliability | Grammar-for-validity |
|---|---|---|
| Goal | "Parseable JSON" | "Parseable JSON that is also a valid plan" |
| Constraint type | Syntax + type | Syntax + type + semantic invariants |
| Enforcement | Decoder produces JSON; app validates | Decoder produces only valid objects by construction |
| Failure mode solved | Malformed JSON → dropped examples | Valid JSON but invalid plan → wasted rollout |
Canonical instance¶
Databricks' join-order agent (Source: sources/2026-04-22-databricks-are-llm-agents-good-at-join-order-optimization):
"Each tool call generates a join ordering using structured model outputs, which forces the model's output to match a grammar we specify to only admit valid join reorderings."
A join order is semantically valid iff:
- Every table in the query appears exactly once.
- The binary-tree structure is well-formed.
- Associativity/commutativity constraints are preserved.
A free-form generation would frequently produce orderings that miss a table, duplicate one, or reference a non-existent table. All of these are valid JSON and all are useless. The grammar rules them out up front, so every rollout lands on a semantically legal plan the execution engine can actually run.
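The validity conditions above can be made concrete with a small checker. This is a minimal sketch, not the source's implementation: the plan encoding (a leaf is a table name, an inner node is a two-element list) and the names `leaves` and `is_valid_join_order` are assumptions chosen for illustration. It covers only the first two conditions, not the associativity/commutativity constraints.

```python
def leaves(tree):
    """Return the table names at the leaves, left to right.

    Hypothetical encoding: a leaf is a str, an inner node is [left, right].
    """
    if isinstance(tree, str):
        return [tree]
    if not (isinstance(tree, list) and len(tree) == 2):
        raise ValueError(f"not a binary join node: {tree!r}")
    return leaves(tree[0]) + leaves(tree[1])


def is_valid_join_order(tree, tables):
    """True iff the tree is binary and uses every table exactly once."""
    try:
        found = leaves(tree)
    except ValueError:
        return False  # malformed tree structure
    # Comparing sorted lists catches missing, duplicated, and unknown tables.
    return sorted(found) == sorted(tables)


tables = ["customers", "lineitem", "orders"]
print(is_valid_join_order([["orders", "lineitem"], "customers"], tables))
print(is_valid_join_order([["orders", "lineitem"], "orders"], tables))
```

Each rejected input in this sketch is well-formed JSON, which is the gap between parseability and plan validity that the pattern targets.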
When this fits¶
| Condition | Why |
|---|---|
| Output is a structured object (not prose) | Grammar-constrained decoding only makes sense for structured output |
| Semantic validity is grammar-expressible | Not every correctness property can be captured by a grammar. "Every table appears exactly once" can, if the grammar is built to track which tables remain at each step |
| Validity failures are costly | Each invalid output wastes a rollout (or worse, corrupts a downstream system) |
| Grammar can be inferred from schema | You can auto-generate the grammar from a SQL schema + join graph, not hand-maintain it |
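As an illustration of deriving the constraint from the query rather than hand-maintaining it, the sketch below builds a recursive JSON Schema for join trees from a list of table names. The function name `join_tree_schema` and the tree encoding are hypothetical. Whether a given provider's structured-output feature accepts recursive schemas or `minItems`/`maxItems` is an assumption to verify against that provider's documentation.

```python
import json


def join_tree_schema(tables):
    """Build a JSON Schema whose instances are binary trees over `tables`.

    A node is either a table name (enum) or a two-element array of nodes.
    """
    return {
        "$defs": {
            "node": {
                "anyOf": [
                    {"type": "string", "enum": sorted(tables)},
                    {
                        "type": "array",
                        "items": {"$ref": "#/$defs/node"},
                        "minItems": 2,
                        "maxItems": 2,
                    },
                ]
            }
        },
        "$ref": "#/$defs/node",
    }


schema = join_tree_schema(["orders", "customers", "lineitem"])
print(json.dumps(schema, indent=2))
```

A schema of this shape rules out unknown table names and non-binary nodes, but it cannot express "each table exactly once". That property needs a richer grammar or a post-hoc check.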
When it doesn't fit¶
- Outputs are prose with style constraints. Natural-language outputs rarely benefit from grammar constraints; style is better shaped by prompting.
- Validity is defined by runtime behaviour. E.g. "the plan must not OOM" — not grammar-expressible; falls back to execute-and-check (which is what the outer pattern does anyway).
- The grammar would be as complex as the full semantics. If the grammar approaches the complexity of a type checker, it's easier to post-hoc validate and retry.
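For the cases above where a grammar is the wrong tool, the post-hoc alternative can be sketched as a validate-and-retry loop. The helper name `generate_with_retry` and its `generate` and `is_valid` callables are hypothetical stand-ins for a model call and a validator; the stub model below exists only to make the example runnable.

```python
def generate_with_retry(generate, is_valid, max_attempts=3):
    """Call generate() until is_valid(candidate) holds or attempts run out.

    Returns the accepted candidate and the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = generate()
        if is_valid(candidate):
            return candidate, attempt
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


# Stub "model" that first duplicates a table, then produces a valid order.
outputs = iter([["orders", "orders"], ["orders", "customers"]])
plan, attempts = generate_with_retry(
    generate=lambda: next(outputs),
    is_valid=lambda p: len(set(p)) == len(p),
)
print(plan, attempts)
```

Every failed attempt here costs a full generation, which is the overhead that grammar-constrained decoding avoids when the property is grammar-expressible.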
Implementation notes¶
- Modern frontier-model APIs (OpenAI, Anthropic, Gemini) expose grammar-constrained or schema-constrained decoding surfaces (JSON Schema, Pydantic, context-free grammars). Which surface you use depends on the validity property's complexity.
- Token-level constraints can hurt output quality: a grammar that is too tight may force the model down paths it cannot reason well about. Test empirically against a free-generation-plus-validation baseline.
- A context-free grammar can encode table uniqueness by explicitly enumerating the tables that remain at each step, but this needs one nonterminal per subset of remaining tables, so the grammar grows exponentially with the number of tables. Alternative: generate a sequence, then reject and retry if it is invalid.
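The remaining-tables enumeration can be sketched for the simpler case of left-deep orders (a flat sequence of tables). This is an illustrative construction, not the grammar used in the source. The names `left_deep_grammar` and `enumerate_orders` are hypothetical, and the rules are kept as Python data rather than in any particular decoder's grammar syntax. The sketch also counts the nonterminals so the growth can be checked directly.

```python
from itertools import combinations


def left_deep_grammar(tables):
    """One nonterminal per non-empty subset of remaining tables.

    The rule R(S) -> t R(S - {t}) picks a table t from S, then continues
    with the rest, so no table can be emitted twice or left out.
    """
    tables = sorted(tables)
    rules = {}
    for size in range(1, len(tables) + 1):
        for subset in combinations(tables, size):
            remaining = frozenset(subset)
            rules[remaining] = [(t, remaining - {t}) for t in subset]
    return rules


def enumerate_orders(rules, remaining):
    """Yield every sequence derivable from the nonterminal `remaining`."""
    if not remaining:
        yield []
        return
    for table, rest in rules[remaining]:
        for tail in enumerate_orders(rules, rest):
            yield [table] + tail


rules = left_deep_grammar(["a", "b", "c"])
print(len(rules))  # 7 nonterminals for 3 tables
print(len(list(enumerate_orders(rules, frozenset("abc")))))  # 6 orders
```

With 3 tables the construction needs 7 nonterminals; with 10 tables it needs 1023, since there is one per non-empty subset. That growth is the practical reason to consider generate-then-validate for larger queries.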
Composition¶
| Pattern | Relationship |
|---|---|
| patterns/llm-agent-offline-query-plan-tuner | Outer pattern — this is the validity leg |
| concepts/structured-output-reliability | Sibling; same decoder feature used for parseability, not validity |
| patterns/tool-call-loop-minimal-agent | Natural fit — narrow-tool agents especially benefit since validity errors waste tool calls |
Seen in¶
- sources/2026-04-22-databricks-are-llm-agents-good-at-join-order-optimization — Canonical first wiki instance. Grammar admits only valid join reorderings; every tool call starts with a legal plan.
Related¶
- concepts/structured-output-reliability — sibling axis (parseability)
- concepts/llm-agent-as-query-optimizer — the domain application
- concepts/join-order-optimization — the specific validity property (valid orderings)
- patterns/llm-agent-offline-query-plan-tuner — the containing pattern
- systems/databricks-join-order-agent — the prototype