CONCEPT Cited by 1 source

Semantic Role Labeling (SRL)

Definition

Semantic Role Labeling (SRL) in the e-commerce query-understanding context is the task of extracting structured slots — product, brand, attribute, size, quantity, flavor — from a free-form user query string. The output is a tagged version of the query, e.g.:

"organic whole milk 2 gallon"
  → product:   milk
  → attribute: organic, whole
  → size:      2 gallon

The name is borrowed from classical NLP, where SRL assigns semantic-role labels to the arguments of a predicate ("who did what to whom"). The e-commerce adaptation repurposes the vocabulary for query tagging: the "roles" are product-taxonomy slots rather than predicate arguments.
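
As a rough illustration of the tagged output described in this section, the slots can be held in a small record type. This is a minimal sketch, not Instacart's schema: the class name `TaggedQuery`, its field names, and the `filled_slots` helper are assumptions made for the example; only the slot types themselves come from the definition.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaggedQuery:
    """Hypothetical container for one tagged query (field names are assumptions)."""
    raw: str
    product: Optional[str] = None
    brand: Optional[str] = None
    attributes: List[str] = field(default_factory=list)
    size: Optional[str] = None
    quantity: Optional[str] = None
    flavor: Optional[str] = None

    def filled_slots(self) -> Dict[str, object]:
        """Return only the slots the tagger actually populated."""
        slots = {
            "product": self.product,
            "brand": self.brand,
            "attributes": self.attributes,
            "size": self.size,
            "quantity": self.quantity,
            "flavor": self.flavor,
        }
        return {name: value for name, value in slots.items() if value}


# The example query from the definition, expressed as a record.
example = TaggedQuery(
    raw="organic whole milk 2 gallon",
    product="milk",
    attributes=["organic", "whole"],
    size="2 gallon",
)
print(example.filled_slots())
```

Keeping unfilled slots as `None` or an empty list makes it easy for a consumer to tell "no tag extracted" apart from a tag with a value.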

Where SRL tags get used

Instacart's Intent Engine post names four downstream consumers (Source: sources/2025-11-13-instacart-building-the-intent-engine):

  • Search retrieval — use the product tag to constrain the candidate set and the brand / attribute tags to filter.
  • Ranking — boost documents matching extracted tags.
  • Ad targeting — surface sponsored products that match the brand/attribute constraints.
  • Filters / facets — expose SRL output in the UI as toggleable facets.
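
The first two consumers can be sketched over a toy in-memory catalog. Everything here is an assumption for illustration: the catalog records, the field names, the brand names, and the choice to use the brand tag as a hard filter while attribute tags only boost the ranking score. The source does not describe how Instacart implements either step.

```python
from typing import Dict, List

# Toy catalog with typed fields; contents are made up for the example.
CATALOG = [
    {"id": 1, "product": "milk", "brand": "BrandA", "attributes": {"organic", "whole"}},
    {"id": 2, "product": "milk", "brand": "BrandB", "attributes": {"whole"}},
    {"id": 3, "product": "yogurt", "brand": "BrandA", "attributes": {"organic"}},
]


def retrieve(tags: Dict[str, object], catalog: List[dict]) -> List[dict]:
    """Constrain candidates by the product tag, then filter by brand if tagged."""
    candidates = catalog
    product = tags.get("product")
    if product:
        candidates = [doc for doc in candidates if doc["product"] == product]
    brand = tags.get("brand")
    if brand:
        candidates = [doc for doc in candidates if doc["brand"] == brand]
    return candidates


def rank(tags: Dict[str, object], candidates: List[dict]) -> List[dict]:
    """Boost documents by the number of attribute tags they match."""
    wanted = set(tags.get("attributes", []))
    return sorted(
        candidates,
        key=lambda doc: len(wanted & doc["attributes"]),
        reverse=True,
    )


tags = {"product": "milk", "attributes": ["organic", "whole"]}
ranked_ids = [doc["id"] for doc in rank(tags, retrieve(tags, CATALOG))]
print(ranked_ids)
```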

Why SRL is load-bearing

SRL tags are the structured bridge between unstructured user queries and the structured product catalog. A search index stores documents with typed fields (brand, category, size); the query is raw text. Without SRL, every query has to be matched via text similarity across all fields, which is noisy and expensive, and which loses the signal that "organic" is an attribute filter rather than a brand search. With SRL, the query gets rewritten into a typed structured query the index can execute efficiently.
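
One way to picture the rewrite from raw text into a typed query is the sketch below. The output shape (`filters`, `fallback_text`) and the function name `to_structured_query` are hypothetical and do not correspond to any particular search engine's query DSL; the point is only that each tag becomes a clause against a typed field, and that an untagged query falls back to plain text matching.

```python
from typing import Dict, List


def to_structured_query(tags: Dict[str, object]) -> Dict[str, object]:
    """Turn SRL tags into typed field clauses, with a free-text fallback."""
    filters: List[Dict[str, str]] = []
    # Single-valued slots become exact-match clauses on the matching field.
    for field_name in ("product", "brand", "size"):
        value = tags.get(field_name)
        if value:
            filters.append({"field": field_name, "equals": value})
    # Each attribute tag becomes its own membership clause.
    for attribute in tags.get("attributes", []):
        filters.append({"field": "attributes", "contains": attribute})
    # With no tags at all, fall back to text similarity on the raw query.
    fallback_text = "" if filters else tags.get("raw", "")
    return {"filters": filters, "fallback_text": fallback_text}


structured = to_structured_query({
    "raw": "organic whole milk 2 gallon",
    "product": "milk",
    "attributes": ["organic", "whole"],
    "size": "2 gallon",
})
print(structured)
```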

The tail-query problem

SRL has a power-law traffic problem: head queries ("bananas", "milk", "bread") are tagged thousands of times a day, while tail queries ("red hot chili pepper spice") are tagged rarely or never. Legacy model-based SRL that learns from engagement data fails on the tail because there is no data to learn from.
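
The head/tail split can be made concrete by measuring how much traffic the most frequent queries account for. The sketch below is illustrative only: the query log and its counts are invented, and `head_coverage` is a hypothetical helper name.

```python
from collections import Counter
from typing import List


def head_coverage(query_log: List[str], head_size: int) -> float:
    """Fraction of logged traffic covered by the `head_size` most frequent queries."""
    if not query_log:
        return 0.0
    counts = Counter(query_log)
    head_traffic = sum(count for _, count in counts.most_common(head_size))
    return head_traffic / len(query_log)


# Made-up log: a few head queries dominate, one tail query appears once.
log = (
    ["milk"] * 5
    + ["bananas"] * 3
    + ["bread"] * 1
    + ["red hot chili pepper spice"] * 1
)
print(head_coverage(log, head_size=2))
```

On a real power-law log, a small head covers most of the traffic while the long tail contributes many distinct queries with almost no engagement data each.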

Instacart's production SRL architecture responds with a hybrid design that pairs a head-query cache with a fine-tuned model served in real time:

  • Head cache: tags pre-computed by an offline RAG pipeline plus a frontier LLM. Serves ~98% of queries at zero inference latency.
  • Real-time student: a LoRA-fine-tuned Llama-3-8B trained on the head pipeline's output. Serves the ~2% of traffic that misses the cache (the tail) at ~300 ms on H100.

See patterns/head-cache-plus-tail-finetuned-model for the general pattern and systems/instacart-intent-engine for the canonical implementation.
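
The serving path implied by this split can be sketched as a cache lookup with a model fallback. This is a minimal sketch under stated assumptions: `HybridTagger`, the whitespace and lowercase normalization, and the stub tail model are all hypothetical. In the described system the cache would be filled by the offline pipeline and the fallback would be the fine-tuned student model.

```python
from typing import Callable, Dict

Tags = Dict[str, object]


class HybridTagger:
    """Serve cached tags for head queries; call a model only on cache misses."""

    def __init__(self, head_cache: Dict[str, Tags], tail_model: Callable[[str], Tags]):
        self.head_cache = head_cache
        self.tail_model = tail_model

    @staticmethod
    def normalize(query: str) -> str:
        # Assumed normalization so trivially different strings share a cache key.
        return " ".join(query.lower().split())

    def tag(self, query: str) -> Tags:
        key = self.normalize(query)
        cached = self.head_cache.get(key)
        if cached is not None:
            return cached  # head query: no model inference needed
        return self.tail_model(key)  # tail query: real-time inference


def stub_tail_model(query: str) -> Tags:
    # Stand-in for the real-time model; a real one would run inference here.
    return {"product": query.split()[-1]}


tagger = HybridTagger({"milk": {"product": "milk"}}, stub_tail_model)
print(tagger.tag("  Milk "))
print(tagger.tag("red hot chili pepper spice"))
```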

Quality posture: precision over recall

For SRL on retrieval-side tags, precision is more valuable than recall. A false-positive tag (brand=MuchPure when the query isn't about MuchPure) biases retrieval toward the wrong products; a false-negative tag just falls back to text similarity — the existing behaviour. Instacart's production 8B student ships "96.4% precision, 95.0% recall" — an explicit +1.0 precision / -1.2 recall trade against the frontier teacher, and the recall drop is accepted because the precision gain is more load-bearing for the downstream search experience.
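
The asymmetry between the two error types shows up directly in how tag-level precision and recall are computed. The sketch below assumes tags are compared as exact `(slot, value)` pairs and that an empty prediction or empty gold set scores 1.0; both conventions are choices made for this example, and the source does not say how Instacart scores its models.

```python
from typing import Set, Tuple

Tag = Tuple[str, str]


def precision_recall(predicted: Set[Tag], gold: Set[Tag]) -> Tuple[float, float]:
    """Tag-level precision and recall for one query, using exact-match pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    return precision, recall


gold = {("product", "milk"), ("attribute", "organic")}

# A spurious brand tag (false positive) lowers precision and leaves recall intact.
with_false_positive = gold | {("brand", "MuchPure")}
print(precision_recall(with_false_positive, gold))

# A missed attribute tag (false negative) lowers recall and leaves precision intact.
with_false_negative = {("product", "milk")}
print(precision_recall(with_false_negative, gold))
```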

Related

  • concepts/multi-modal-attribute-extraction: sibling extraction task applied to the product catalog rather than the query, augmented with images. Instacart's PARSE extracts the same slot types (flavor, size, brand, attribute) from product listings; Intent Engine's SRL extracts them from user queries. The two systems produce matching structured vocabularies so retrieval can join on them.
  • Named Entity Recognition (NER): NLP ancestor; SRL-for-queries can be seen as a domain-specific NER where entity types are taxonomy slots.
  • Slot filling in task-oriented dialog: structurally similar, different domain.

Caveats

  • Tag vocabulary is catalog-coupled. Changing the product taxonomy invalidates the SRL model.
  • Tag disambiguation hinges on telling brand names apart from common words ("Apple" the brand vs. apple the fruit). Context engineering helps but doesn't eliminate the ambiguity.
  • Precision-recall threshold is a moving product decision. Different downstream consumers (retrieval, ads, filters) have different precision/recall tolerances; shipping one SRL model for all consumers is a compromise.
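
One hedged way to picture the last caveat is a single tagger whose output is filtered differently per consumer. This sketch assumes the tagger emits a confidence score per tag, which the source does not state, and the threshold values and the `tags_for` helper are invented for illustration.

```python
from typing import Dict, List, Tuple

ScoredTag = Tuple[str, str, float]  # (slot, value, confidence)

# Made-up thresholds: a stricter consumer keeps fewer, higher-confidence tags.
CONSUMER_THRESHOLDS: Dict[str, float] = {
    "retrieval": 0.9,
    "ads": 0.8,
    "filters": 0.6,
}


def tags_for(consumer: str, scored_tags: List[ScoredTag]) -> List[Tuple[str, str]]:
    """Keep only the tags whose confidence clears the consumer's threshold."""
    threshold = CONSUMER_THRESHOLDS[consumer]
    return [
        (slot, value)
        for slot, value, confidence in scored_tags
        if confidence >= threshold
    ]


scored = [
    ("product", "milk", 0.97),
    ("attribute", "organic", 0.85),
    ("brand", "MuchPure", 0.65),
]
print(tags_for("retrieval", scored))
print(tags_for("filters", scored))
```

Per-consumer filtering only moves each consumer along the same model's precision/recall curve, so it softens the compromise without removing it.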

Seen in

  • sources/2025-11-13-instacart-building-the-intent-engine
