
CONCEPT Cited by 1 source

Context-aware retrieval

Definition

Context-aware retrieval is the RAG refinement in which the retrieval query is enriched with case-specific structural metadata (tenant, jurisdiction, document type, risk level, product family, transaction size, etc.) before the vector similarity search runs. The top-k results are then not just semantically close to the natural-language question but also restricted to the corpus slice that those attributes mark as relevant.

(Source: sources/2026-04-23-aws-modernizing-kyc-with-aws-serverless-solutions-and-agentic-ai.)

Shape

agent question + case metadata ──► retrieval query
                                   (question_embedding,
                                    metadata_filter = {
                                      jurisdiction: "EU",
                                      document_type: "passport",
                                      risk_level: "medium"
                                    })
                                        │
                                        ▼
                                   vector store
                                        │
                                        ▼
                     top-k chunks where semantic_score is high
                                  AND metadata matches

This differs from plain-vector RAG in that metadata filters are mandatory pre-conditions, not soft preferences. Without them, a KYC question like "what identity documents satisfy this case?" retrieves generic global guidance rather than EU-specific rules for this customer's passport type.
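
As a minimal sketch of the "mandatory pre-condition" idea, the query builder below refuses to produce a retrieval query unless all three axes from the diagram are present. The names `RetrievalQuery`, `build_retrieval_query`, and `REQUIRED_AXES` are hypothetical; the source post does not show how the KYC architecture assembles its queries.

```python
from dataclasses import dataclass

# Assumed axis names, copied from the diagram above; a real schema may differ.
REQUIRED_AXES = ("jurisdiction", "document_type", "risk_level")


@dataclass(frozen=True)
class RetrievalQuery:
    question_embedding: tuple
    metadata_filter: dict


def build_retrieval_query(question_embedding, case_metadata):
    """Combine a question embedding with a mandatory metadata filter."""
    missing = [axis for axis in REQUIRED_AXES if not case_metadata.get(axis)]
    if missing:
        # Filters are pre-conditions, so do not silently fall back
        # to plain-vector search when an axis is absent.
        raise ValueError("missing case metadata: " + ", ".join(missing))
    # Keep only the filter axes; other case fields are not sent to the store.
    metadata_filter = {axis: case_metadata[axis] for axis in REQUIRED_AXES}
    return RetrievalQuery(tuple(question_embedding), metadata_filter)
```

Raising on a missing axis is one possible policy; an agent could instead route the case to a step that collects the missing metadata before retrying.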

Canonical framing from the KYC architecture

"Context-aware retrieval enriches queries with case-specific information, including customer jurisdiction, document types, and risk levels – facilitating highly relevant regulatory guidance." (Source: same post.)

The KYC architecture uses three metadata axes explicitly:

  • Customer jurisdiction (EU / US / SG / multi): filters the applicable regulatory framework (AMLD vs BSA vs MAS vs FATF).
  • Document type (passport / national ID / driver's license / proof of address): filters the OCR + verification rubric.
  • Risk level (high / medium / low): filters policy-matched escalation / verification requirements.

Why plain-vector RAG is insufficient here

Pure cosine-similarity retrieval has three failure modes that context-aware retrieval is designed to eliminate in regulated domains:

  1. Cross-jurisdiction leakage. An EU-jurisdiction case retrieves US BSA guidance because the semantic embedding is close enough. The result is not merely inaccurate; it applies the wrong regulatory regime.
  2. Document-type confusion. A "passport" question retrieves "driver's license" guidance. General-purpose embedding models often do not separate these document types finely enough.
  3. Risk-band mixing. A high-risk case retrieves low-risk guidance because the low-risk corpus is much larger and crowds the top-k.

Metadata filtering solves all three at the index level.
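
The first two failure modes can be reproduced on a toy corpus. The sketch below uses made-up two-dimensional embeddings and hypothetical chunk ids; it only illustrates that filtering before ranking removes chunks that plain cosine ranking would place first.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Toy chunks; the embeddings are invented so that the US chunk is the
# closest match to the query and the EU passport chunk is the furthest.
CHUNKS = [
    {"id": "us-bsa-passport", "jurisdiction": "US",
     "document_type": "passport", "embedding": (1.0, 0.0)},
    {"id": "eu-amld-license", "jurisdiction": "EU",
     "document_type": "driver_license", "embedding": (0.95, 0.1)},
    {"id": "eu-amld-passport", "jurisdiction": "EU",
     "document_type": "passport", "embedding": (0.9, 0.3)},
]


def top_k(query_embedding, chunks, k, metadata_filter=None):
    """Rank chunks by cosine similarity, optionally after a hard filter."""
    if metadata_filter:
        chunks = [
            c for c in chunks
            if all(c.get(field) == value
                   for field, value in metadata_filter.items())
        ]
    ranked = sorted(
        chunks,
        key=lambda c: cosine(query_embedding, c["embedding"]),
        reverse=True,
    )
    return [c["id"] for c in ranked[:k]]


query = (1.0, 0.0)
plain = top_k(query, CHUNKS, k=2)
filtered = top_k(
    query, CHUNKS, k=2,
    metadata_filter={"jurisdiction": "EU", "document_type": "passport"},
)
```

Here `plain` ranks the US chunk first and the EU driver's-license chunk second, while `filtered` contains only the EU passport chunk, even though it is the least similar of the three.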

Index-level implementation shapes

Context-aware retrieval requires the vector store to support hybrid (metadata + vector) queries. Most modern vector stores do:

  • OpenSearch Serverless: k-NN query + filter clause on indexed fields (what the KYC architecture uses).
  • systems/s3-vectors: filter-on-metadata + vector query in the same API call.
  • Pinecone: metadata filter + top-k; extensively used.
  • pgvector: SQL WHERE clause + ORDER BY embedding distance.
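
To make two of these shapes concrete, the sketch below builds an OpenSearch k-NN request body with a filter clause and shows an equivalent pgvector statement. Everything specific is an assumption: the vector field name `embedding`, the keyword fields, the table name `regulatory_chunks`, and the choice of a filter nested inside the `knn` clause. The source post does not disclose the query the KYC architecture actually issues.

```python
def build_knn_query(question_embedding, metadata_filter, k=5,
                    vector_field="embedding"):
    """Build an OpenSearch request body: k-NN search restricted by term filters."""
    # One exact-match clause per metadata axis (fields assumed to be keywords).
    term_clauses = [
        {"term": {field: value}}
        for field, value in sorted(metadata_filter.items())
    ]
    return {
        "size": k,
        "query": {
            "knn": {
                vector_field: {
                    "vector": list(question_embedding),
                    "k": k,
                    # Filter applied as part of the k-NN search itself.
                    "filter": {"bool": {"must": term_clauses}},
                }
            }
        },
    }


# pgvector equivalent with driver-style placeholders; `<=>` is pgvector's
# cosine-distance operator. Table and column names are hypothetical.
PGVECTOR_SQL = """
SELECT chunk_id, chunk_text
FROM regulatory_chunks
WHERE jurisdiction = %s
  AND document_type = %s
  AND risk_level = %s
ORDER BY embedding <=> %s::vector
LIMIT %s
"""
```

The body returned by `build_knn_query` would be passed to whatever search client the application uses; no client call is shown here because the source does not specify one.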

Two common hazards:

  • Metadata normalisation. If document ingestion doesn't normalise jurisdiction codes (EU vs "European Union" vs DE), the filter silently misses documents whose stored value differs from the query value. Ingestion-time normalisation matters more than retrieval-time query normalisation.
  • Filter selectivity. Over-specific filters (jurisdiction = EU AND document_type = passport AND risk_level = medium) can return zero chunks. The agent needs a fallback policy: progressive relaxation, or an explicit "no matching regulation" signal.
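
Both hazards can be sketched in a few lines. The alias table, the relaxation order, and the function names below are illustrative assumptions, not something the source describes; in particular, mapping a member-state code such as DE to EU is a policy decision that a real deployment would have to make deliberately.

```python
# Hypothetical alias table applied at ingestion time and at query time.
JURISDICTION_ALIASES = {
    "eu": "EU",
    "european union": "EU",
    "de": "EU",  # assumption: member-state codes roll up to the EU framework
    "us": "US",
    "united states": "US",
}


def normalise_jurisdiction(raw):
    key = raw.strip().lower()
    if key not in JURISDICTION_ALIASES:
        # Fail loudly rather than index a value no filter will ever match.
        raise ValueError(f"unknown jurisdiction: {raw!r}")
    return JURISDICTION_ALIASES[key]


# Axes to drop, in order, when the full filter matches nothing.
# Jurisdiction is never relaxed in this sketch.
RELAXATION_ORDER = ("risk_level", "document_type")


def search_with_relaxation(search, metadata_filter):
    """Call search(filter) and relax one axis at a time on empty results.

    Returns (chunks, dropped_axes). An empty chunk list is the explicit
    "no matching regulation" signal.
    """
    active = dict(metadata_filter)
    dropped = []
    while True:
        chunks = search(active)
        if chunks:
            return chunks, dropped
        remaining = [axis for axis in RELAXATION_ORDER if axis in active]
        if not remaining:
            return [], dropped
        del active[remaining[0]]
        dropped.append(remaining[0])
```

Returning `dropped_axes` alongside the chunks lets the agent state which constraints were relaxed instead of presenting relaxed results as an exact match.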

Relation to other RAG variants

Caveats

  • Only the axes, not the filter syntax, are disclosed. The KYC post names jurisdiction / document-type / risk-level as enrichment axes but doesn't show the actual OpenSearch query.
  • The post conflates context-aware retrieval with grounding. The cited passage is paired with "this continuous knowledge access keeps agent decisions grounded in institutional knowledge rather than hallucinating responses" — but hallucination-prevention is the outcome of grounded RAG broadly, not specifically of metadata-enriched retrieval.

Seen in
