
PATTERN

Structured output grammar for valid plans

Shape

When an LLM agent must produce a structured object with correctness constraints beyond well-formedness — e.g. a query plan, a scheduling decision, a config edit — express those constraints as a grammar passed to the model's structured-output decoder. The grammar admits only valid objects before generation, eliminating the failure mode where the agent produces syntactically valid but semantically invalid output.

This is distinct from, and stronger than, reliability-style JSON schema validation:

| Axis | Structured-output reliability | Grammar-for-validity |
| --- | --- | --- |
| Goal | "Parseable JSON" | "Parseable JSON that is also a valid plan" |
| Constraint type | Syntax + type | Syntax + type + semantic invariants |
| Enforcement | Decoder produces JSON; app validates | Decoder produces only valid objects by construction |
| Failure mode solved | Malformed JSON → dropped examples | Valid JSON but invalid plan → wasted rollout |
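
The distinction can be made concrete: a type-level check passes any well-shaped object, while a semantic check over the same object rejects an invalid plan. A minimal sketch — the plan encoding and `TABLES` set are illustrative assumptions, not from the source:

```python
# Sketch: a plan that is schema-valid JSON but semantically invalid.
# The plan shape (list of table names) and table set are illustrative.
import json

TABLES = {"orders", "customers", "items"}  # tables the query references

def type_check(plan) -> bool:
    """'Structured-output reliability': is it a list of strings?"""
    return isinstance(plan, list) and all(isinstance(t, str) for t in plan)

def semantic_check(plan) -> bool:
    """'Grammar-for-validity': every table exactly once, no strays."""
    return type_check(plan) and sorted(plan) == sorted(TABLES)

plan = json.loads('["orders", "orders", "items"]')  # duplicates a table
assert type_check(plan)          # parses and type-checks fine
assert not semantic_check(plan)  # but it is not a valid join ordering
```

Grammar-constrained decoding moves the `semantic_check` half into the decoder, so the invalid object is never emitted in the first place.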

Canonical instance

Databricks' join-order agent (Source: sources/2026-04-22-databricks-are-llm-agents-good-at-join-order-optimization):

"Each tool call generates a join ordering using structured model outputs, which forces the model's output to match a grammar we specify to only admit valid join reorderings."

A join order is semantically valid iff:

  • Every table in the query appears exactly once.
  • The binary-tree structure is well-formed.
  • Associativity/commutativity constraints are preserved.

Free-form generation would frequently produce orderings that miss a table, duplicate one, or reference a non-existent table — all valid JSON, all useless. The grammar pre-emptively eliminates these, so every rollout lands on a semantically legal plan the execution engine can actually run.
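
To see how a grammar can carry the "every table exactly once" invariant, here is a toy grammar builder — not Databricks' actual grammar, just a sketch of the standard encoding: one nonterminal `J_S` per nonempty subset `S` of tables, with a rule for every way to split `S` into a left and right join input. Only trees naming each table exactly once are derivable from the start symbol.

```python
# Sketch: generate a CFG (as rule strings) whose language is exactly the
# binary join trees over a fixed table set.  Illustrative, not a real
# production grammar format.
from itertools import combinations

def join_grammar(tables):
    tables = tuple(sorted(tables))
    rules = []
    def nt(subset):
        return "J_" + "_".join(subset)
    subsets = [c for r in range(1, len(tables) + 1)
                 for c in combinations(tables, r)]
    for s in subsets:
        if len(s) == 1:
            rules.append(f'{nt(s)} -> "{s[0]}"')
        else:
            # Every split of s into two nonempty halves; left/right order
            # matters, which is what preserves commutativity choices.
            for r in range(1, len(s)):
                for left in combinations(s, r):
                    right = tuple(t for t in s if t not in left)
                    rules.append(
                        f'{nt(s)} -> "(" {nt(left)} "JOIN" {nt(right)} ")"')
    return rules

rules = join_grammar(["a", "b", "c"])
# Start symbol J_a_b_c derives only trees naming a, b, c once each.
```

Note the cost: the nonterminal count grows with the subset lattice of the table set, which is why such grammars are auto-generated from the query rather than hand-written.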

When this fits

| Condition | Why |
| --- | --- |
| Output is a structured object (not prose) | Grammar-constrained decoding only makes sense for structured output |
| Semantic validity is grammar-expressible | Not every correctness property is context-free; e.g. "every table appears exactly once" can be encoded by a careful state-machine grammar |
| Validity failures are costly | Each invalid output wastes a rollout (or worse, corrupts a downstream system) |
| Grammar can be inferred from schema | You can auto-generate the grammar from a SQL schema + join graph rather than hand-maintain it |
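
The last row — deriving the constraints from the schema and join graph rather than hand-maintaining them — can be sketched as a validity check driven entirely by a join-graph data structure. The edge set, the nested-tuple plan encoding, and the table names below are all illustrative assumptions:

```python
# Sketch: validity constraints derived from a (hypothetical) join graph.
# Plans are nested tuples: ("customers", ("orders", "items")).
JOIN_GRAPH = {  # edges: table pairs that share a join predicate
    ("orders", "customers"), ("orders", "items"),
}

def tables_of(node):
    """Collect the leaf tables of a join tree."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return tables_of(left) + tables_of(right)

def valid(plan, graph=JOIN_GRAPH):
    expected = {t for edge in graph for t in edge}
    if sorted(tables_of(plan)) != sorted(expected):  # each table exactly once
        return False
    def joins_ok(node):  # every join backed by an edge across the split
        if isinstance(node, str):
            return True
        l, r = node
        crosses = any((a, b) in graph or (b, a) in graph
                      for a in tables_of(l) for b in tables_of(r))
        return crosses and joins_ok(l) and joins_ok(r)
    return joins_ok(plan)

assert valid(("customers", ("orders", "items")))
assert not valid((("customers", "items"), "orders"))  # no cross edge
```

The same graph that drives this checker is what a grammar generator would consume, so adding a table to the schema updates the constraints for free.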

When it doesn't fit

  • Outputs are prose with style constraints. Natural-language outputs rarely benefit from grammar constraints; style is better shaped by prompting.
  • Validity is defined by runtime behaviour. E.g. "the plan must not OOM" — not grammar-expressible; falls back to execute-and-check (which is what the outer pattern does anyway).
  • The grammar would be as complex as the full semantics. If the grammar approaches the complexity of a type checker, it's easier to post-hoc validate and retry.

Implementation notes

  • Modern frontier-model APIs (OpenAI, Anthropic, Gemini) expose grammar-constrained or schema-constrained decoding surfaces (JSON Schema, Pydantic, context-free grammars). Which surface you use depends on the validity property's complexity.
  • Token-level constraints can impact quality: too tight a grammar may force the model into paths it can't reason well about. Empirically test against free generation plus validation as a baseline.
  • Context-free grammars can encode table-uniqueness via explicit enumeration of the remaining tables at each step — but the grammar size blows up combinatorially (roughly one nonterminal per subset of tables). Alternative: generate a sequence, then reject-and-retry if invalid.
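
The reject-and-retry baseline mentioned above is a few lines; `sample_plan` is a hypothetical stub standing in for an unconstrained model call:

```python
# Sketch: free generation + post-hoc validation + retry, the baseline
# to compare grammar-constrained decoding against.
import random

def sample_plan(tables):
    """Hypothetical stand-in for a free-form model call: may duplicate
    or drop tables, as unconstrained generation does."""
    return [random.choice(tables) for _ in range(len(tables))]

def is_valid(plan, tables):
    return sorted(plan) == sorted(tables)  # every table exactly once

def generate_with_retry(tables, max_attempts=50):
    for attempt in range(1, max_attempts + 1):
        plan = sample_plan(list(tables))
        if is_valid(plan, tables):
            return plan, attempt
    raise RuntimeError("no valid plan within budget")

random.seed(0)  # deterministic for the sketch
plan, attempts = generate_with_retry(["orders", "customers", "items"])
```

Each retry costs a full rollout, which is exactly the waste the grammar approach removes; the trade-off is grammar complexity versus retry budget.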

Composition

| Pattern | Relationship |
| --- | --- |
| patterns/llm-agent-offline-query-plan-tuner | Outer pattern — this is the validity leg |
| concepts/structured-output-reliability | Sibling; same decoder feature used for parseability, not validity |
| patterns/tool-call-loop-minimal-agent | Natural fit — narrow-tool agents especially benefit since validity errors waste tool calls |
