PATTERN Cited by 1 source

Deterministic intent with ML fallback

Pattern

For command-interpretation surfaces (voice assistants, chat bots, CLI NL translators), route every input through a deterministic intent engine first — hand-authored phrases, exact matches, rule-based resolution — and invoke the LLM only on miss. The LLM is fallback, not foundation.

"Home Assistant would be like, well, I don't have to ask AI… I know what this is. Let me turn off the lights." (Source: sources/2025-12-02-github-home-assistant-local-first-maintainer-profile)

Structure

 utterance
┌────────────────────────────────────┐
│ Stage 1: deterministic intent      │
│ (hand-authored phrase → action)    │
│   • no ML                          │
│   • fully local                    │
│   • O(1) / O(log n) lookup         │
│   • no hallucination surface       │
└──────────────┬─────────────────────┘
               │ miss
┌────────────────────────────────────┐
│ Stage 2: ML / LLM (optional)       │
│ user-selected backend:             │
│   • OpenAI                         │
│   • Google Gemini                  │
│   • local Llama                    │
│   • none (utterance just fails)    │
└────────────────────────────────────┘
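
The two-stage shape above can be sketched in a few lines. This is an illustrative sketch only; the table, function names, and action strings are invented for this example and are not Home Assistant's actual API.

```python
from typing import Callable, Optional

# Stage 1: hand-authored phrase -> action. A plain dict lookup: O(1),
# fully local, no model, no hallucination surface.
INTENT_TABLE: dict[str, str] = {
    "turn on the kitchen light": "light.kitchen.on",
    "turn off the kitchen light": "light.kitchen.off",
    "turn off the living room fan": "fan.living_room.off",
}

def route(utterance: str,
          llm_fallback: Optional[Callable[[str], Optional[str]]] = None,
          ) -> Optional[str]:
    """Deterministic stage first; ML only on miss."""
    action = INTENT_TABLE.get(utterance.strip().lower())
    if action is not None:
        return action                   # hot path: local, deterministic
    if llm_fallback is not None:
        return llm_fallback(utterance)  # Stage 2: user-selected backend
    return None                         # no backend configured: utterance just fails

route("Turn on the kitchen light")  # -> "light.kitchen.on"
route("make it cozy in here")       # -> None (miss, no fallback configured)
```

Note the fallback is a plain callable parameter: "OpenAI / Gemini / local Llama / none" all slot in without the hot path knowing which, which is exactly the user-selected-backend property in the diagram.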

Canonical instance — Home Assistant's Assist

Home Assistant shipped Assist "before the AI hype" with this exact structure:

  • Stage 1 — structured intent engine. Community-contributed hand-authored phrase templates matched directly to known actions with no ML at all. Commands like "Turn on the kitchen light" or "Turn off the living room fan" resolve to automations via direct phrase → action mapping. Extremely fast. Fully local. No network calls. No model hallucinations. Just direct mapping.
  • Stage 2 — optional AI when flexibility is needed. AI is never mandatory. Users pick the inference path: "You can even say you want to connect your own OpenAI account. Or your own Google Gemini account. Or get a Llama running locally in your own home." Invoked only when a command requires flexible interpretation the phrase templates don't cover.
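
Hand-authored templates in the wild usually carry slots ("the kitchen light", "the living room fan") rather than being pure literal strings. A minimal sketch of slot-filling template matching, assuming a regex-based matcher — the real Home Assistant template format and matching algorithm are not shown in the source:

```python
import re

# Illustrative hand-authored templates: pattern -> intent name.
TEMPLATES = [
    (re.compile(r"^turn (?P<state>on|off) the (?P<name>[\w ]+)$", re.I), "toggle"),
]

def match_intent(utterance: str):
    """Return (intent, slots) on a deterministic match, else None (-> Stage 2)."""
    for pattern, intent in TEMPLATES:
        m = pattern.match(utterance.strip())
        if m:
            return intent, m.groupdict()
    return None

match_intent("Turn off the living room fan")
# -> ('toggle', {'state': 'off', 'name': 'living room fan'})
```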

The stage-boundary algorithm is not disclosed in the source (exact-phrase match? fuzzy? confidence threshold?); the source gives only the behavioural property: "I know what this is, let me turn off the lights" is the deterministic hot path.

Why this shape

Three engineering properties fall out of the "deterministic first, ML second" ordering:

  1. Latency on the hot path. Closed-vocabulary commands dominate home-automation traffic by volume (turn on/off, dim, set temperature, lock/unlock). For these, a phrase match is sub-millisecond, versus hundreds of milliseconds to seconds for an LLM round-trip.

  2. No hallucination surface on the hot path. A phrase that resolves to "living_room.lamp.off" can only trigger that action. An LLM asked the same thing could, in principle, route to the garage door. The deterministic stage eliminates that class of failure for the most-invoked paths.

  3. Local-first compatibility. Stage 1 runs entirely on-box, preserving concepts/local-first-architecture for the command volume that matters. Stage 2 optionally punches through to cloud AI, but only as explicit user opt-in per integration, and only on miss — the cloud doesn't get every command by default.
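
Property 2 can be extended even to the fallback path by validating whatever Stage 2 returns against the same enumerated action set. The source does not describe this; it is a sketch of one way the shape composes with an allowlist, with invented names:

```python
from typing import Callable, Optional

# The same enumerable action set that backs the deterministic stage.
KNOWN_ACTIONS = {"light.kitchen.on", "light.kitchen.off", "fan.living_room.off"}

def safe_fallback(utterance: str, llm: Callable[[str], str]) -> Optional[str]:
    """Run the LLM on a miss, but execute only actions from the known set."""
    candidate = llm(utterance)
    return candidate if candidate in KNOWN_ACTIONS else None

# A model that "hallucinates" an out-of-set action is rejected outright:
safe_fallback("kitchen light on", lambda u: "garage.door.open")  # -> None
```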

Sibling pattern

Structurally a cousin to patterns/static-allowlist-for-critical-rules: critical / load-bearing paths bypass the flexible-but-less-predictable mechanism (there, a data-driven allowlist; here, an LLM). Same shape — "keep the hot path deterministic; let the flexible mechanism handle everything else."

Compare also to sources/2026-03-04-datadog-mcp-server-agent-tools's patterns/query-language-as-agent-tool: rather than letting agents infer from raw samples, give them SQL so the interpretation is explicit and grounded. Same family — reduce the LLM's role from "interpret everything" to "handle the residual after a structured surface has done what it can."

Where it applies beyond voice

Any NL surface with:

  • A closed vocabulary for the hot path (a small, enumerable set of actions / intents handles the majority of traffic).
  • A long tail of flexible-interpretation queries that still need coverage.
  • Latency sensitivity on the hot path.
  • Determinism / safety requirements on the hot path.

Examples beyond Home Assistant's voice pipeline: CLI help commands, customer-service chat bots for return / refund / status queries, search bars with an "I'm feeling lucky" exact-intent layer, IDE command palettes.

Open questions

  • How is the deterministic-match threshold chosen? Phrase templates are hand-authored; how tight is the match (exact? fuzzy? stemmed?)? Not disclosed.
  • What happens on Stage-2 ML fallback failure (no provider configured, API down, rate limit)? Graceful error to user? Silent fail? Not disclosed in the ingested source.
  • How are new Stage-1 phrase templates introduced? The source notes these are "hand-authored phrases contributed by the community" — the submission / review / i18n story is not detailed.

Seen in

  • sources/2025-12-02-github-home-assistant-local-first-maintainer-profile — Home Assistant's Assist voice assistant: Stage-1 community-authored phrase templates (no ML, fully local) + Stage-2 user-selected LLM fallback (OpenAI / Gemini / local Llama). "Home Assistant would be like, well, I don't have to ask AI. I know what this is. Let me turn off the lights." First canonical instance in the wiki.