CONCEPT
Query understanding¶
Definition¶
Query understanding (QU) is the upstream stage of a search pipeline that turns a raw user query string into structured intent signals that downstream retrieval + ranking can consume. Canonical QU sub-tasks:
- Query classification — assign the query to one or more taxonomy categories ("butter milk" → Dairy > Milk).
- Query rewrites — produce alternative query strings that expand recall: synonyms, substitutes, broader queries.
- Semantic role labeling (SRL) — extract structured slots from the query (product, brand, attribute, size, quantity).
- Typo/spelling correction, segmentation, language detection, normalization — all upstream of the above.
QU sits between the raw query and the retrieval index. Its quality bounds the quality of everything downstream — a mis-classified query retrieves from the wrong category, a missing rewrite misses recall, a wrong SRL tag breaks filter semantics.
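The "structured intent signals" a QU stage emits can be sketched as a small record combining the three sub-task outputs. A minimal illustration (the class and field names are hypothetical, not from the source):

```python
from dataclasses import dataclass, field

@dataclass
class QueryIntent:
    """Illustrative QU output: one record that retrieval + ranking consume."""
    raw_query: str
    categories: list[str] = field(default_factory=list)  # query classification
    rewrites: list[str] = field(default_factory=list)    # recall-expanding rewrites
    slots: dict[str, str] = field(default_factory=dict)  # SRL: slot name -> value

# Using the "butter milk" example from the definition above:
intent = QueryIntent(
    raw_query="butter milk",
    categories=["Dairy > Milk"],
    rewrites=["buttermilk", "cultured milk"],
    slots={"product": "buttermilk"},
)
```

The point of the record shape: a wrong value in any one field (category, rewrite, slot) corrupts a different downstream consumer, which is why QU failures surface in so many places.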
Why QU is hard¶
Instacart's 2025-11-13 Intent Engine post enumerates the structural difficulties (Source: sources/2025-11-13-instacart-building-the-intent-engine):
- Broad queries. "healthy food", "frozen snacks" — span dozens of categories; hard to act on.
- No direct feedback. QU is upstream of clicks/conversions. The nearest labelled signal (user searched X, purchased Y) is noisy — a user can search "bread" and buy "bananas".
- Tail queries. "red hot chili pepper spice" or "2% reduced-fat ultra-pasteurized chocolate milk" appear rarely or never in history; engagement-driven models have no data to learn from.
- System complexity. Separate models for each sub-task — each with its own data pipeline, training infra, serving infra — amplify maintenance cost and prevent shared improvements.
Why LLMs help¶
From the same post: LLMs bring world knowledge and linguistic inference to QU — "an LLM already understands that 'Italian parsley' is a synonym for 'flat parsley', while 'curly parsley' is a common substitute" — reducing the specialised-dataset burden and allowing one model to serve multiple QU sub-tasks, collapsing the system-complexity problem.
Adaptation levers for LLM-based QU¶
Instacart's explicit hierarchy, least to most invasive (Source: sources/2025-11-13-instacart-building-the-intent-engine):
- Prompting — cheap; the LLM sees only the prompt.
- Context-engineering (RAG) — inject domain signals into the prompt at inference time (conversion history, catalog, brand embeddings, session context).
- Fine-tuning — bake domain expertise into weights (e.g., LoRA on top of an open-weights base).
The three also form a cost ladder: prompting has no training cost, RAG has offline-pipeline cost, fine-tuning has training + serving-hardware cost.
Serving architecture pattern¶
Search traffic is power-law distributed over queries, so production QU systems typically adopt a hybrid head/tail architecture:
- Head (common queries) — pre-compute with an expensive offline pipeline; serve from cache.
- Tail (rare/new queries) — serve with a real-time fast model (often a distilled student of the offline pipeline's teacher).
Canonical wiki instance: Instacart Intent Engine SRL routing. Cache serves ~98% of queries; real-time model handles ~2%. See patterns/head-cache-plus-tail-finetuned-model.
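The head/tail split reduces to a cache-first lookup with a real-time fallback. A minimal routing sketch, assuming a normalized-query cache key and a stand-in for the distilled student model (both hypothetical):

```python
def route(query: str, cache: dict, realtime_model) -> tuple[str, dict]:
    """Serve head queries from a precomputed cache; fall through to a
    fast real-time model for tail queries."""
    key = query.strip().lower()  # normalization before cache lookup
    if key in cache:             # head: ~98% of traffic in the Instacart example
        return "cache", cache[key]
    return "realtime", realtime_model(key)  # tail: ~2% of traffic

# Toy head cache and a stand-in for the distilled real-time model:
head_cache = {"milk": {"category": "Dairy > Milk"}}
student = lambda q: {"category": "UNKNOWN"}

src, intent = route("Milk", head_cache, student)              # cache hit
src2, intent2 = route("oat milk creamer", head_cache, student)  # cache miss
```

Because the cache absorbs almost all traffic, the real-time model's latency budget only has to hold for the rare-query tail, which is what makes a distilled student feasible there.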
Caveats¶
- QU label quality is bounded by the proxy signal. Conversion-based labels carry user-behaviour noise; click-based labels carry position bias. Neither is ground truth for intent.
- QU is deeply category-taxonomy-bound. Rebuilding QU often requires rebuilding the taxonomy too — a hidden dependency that can eat more time than the model changes.
- QU failure modes compound downstream. A wrong category → wrong retrieval scope → bad ranking even with a perfect ranker. QU regressions are often diagnosed as ranking regressions.
Seen in¶
- sources/2025-11-13-instacart-building-the-intent-engine — canonical wiki reference; LLM-powered three-task QU rebuild at Instacart (category classification + rewrites + SRL) with explicit three-lever adaptation hierarchy.
Related¶
- concepts/semantic-role-labeling — one canonical QU sub-task
- concepts/long-tail-query — the traffic shape that forces hybrid QU architectures
- concepts/context-engineering — the middle adaptation lever
- concepts/intent-preserving-query-translation — adjacent: translate query while preserving user intent
- concepts/query-shape / concepts/query-vs-document-embedding — adjacent retrieval-side concepts
- systems/instacart-intent-engine — canonical production instance
- companies/instacart