
PATTERN

Cold-to-Hot Vector Tiering

Cold-to-hot vector tiering is the operational pattern of storing the full vector corpus in a cheap, storage-optimized index (the cold tier) and selectively promoting a subset — the current working / high-QPS set — into a DRAM/SSD real-time ANN index (the hot tier) on demand.

The canonical instance (AWS, 2025-07-16 preview launch): exporting an S3 Vectors index to an OpenSearch Serverless k-NN collection via a console action.

(Source: sources/2025-07-16-aws-amazon-s3-vectors-preview-launch)

Mechanics (AWS console flow)

  1. Vectors are ingested and stored in an S3 Vectors index. Storage cost is S3-tier. Queries run with subsecond latency on demand.
  2. When a subset of vectors becomes hot (e.g. the catalog active this season, recent fraud patterns, this month's users), pick Advanced search export → Export to OpenSearch on the vector index in the S3 console.
  3. The flow lands on the OpenSearch Service Integration console with the S3 vector source pre-selected and a service access role auto-suggested.
  4. Choose Export. A new OpenSearch Serverless collection is created and a k-NN index is populated with a copy of the vector data from S3.
  5. Monitor progress in the Import history pane. Once status = Complete, query the new OpenSearch k-NN index directly for hot workloads.
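After step 5, application code holds two query paths. The sketch below builds the request payloads each tier expects; it is illustrative only — the bucket, index, and field names are made up, and the exact s3vectors request shape shown is an assumption based on the preview, not authoritative documentation.

```python
# Illustrative payload builders for the two tiers after an export.
# Assumed names: "media-vectors" bucket, "catalog-index", "embedding" field.

def cold_query_request(bucket: str, index: str, vector: list[float], k: int) -> dict:
    """Kwargs for an S3 Vectors query (cold tier, s3vectors client) —
    request shape is an assumption based on the preview launch."""
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "queryVector": {"float32": vector},
        "topK": k,
    }

def hot_query_body(field: str, vector: list[float], k: int) -> dict:
    """OpenSearch k-NN query DSL body (hot tier)."""
    return {"size": k, "query": {"knn": {field: {"vector": vector, "k": k}}}}

cold = cold_query_request("media-vectors", "catalog-index", [0.1, 0.2, 0.3], 10)
hot = hot_query_body("embedding", [0.1, 0.2, 0.3], 10)
```

In a real application these dicts would be passed to a `boto3` s3vectors client and an OpenSearch client respectively; the point here is that the two tiers speak different APIs (see the trade-offs section).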

AWS's articulation:

"OpenSearch's high performance (high QPS, low latency) for critical, real-time applications, such as product recommendations or fraud detection, while keeping less time-sensitive data in S3 Vectors."

When to apply it

Use this pattern when:

  • Cost/GB of your vector store dominates spend
  • The working set is a small fraction of the full corpus
  • Queries are a mix of archival "search my history" and real-time
  • You want to re-tier subsets over time (seasonal catalogs, etc.)

Don't bother when:

  • The corpus is small enough to fit in DRAM cheaply
  • The entire corpus is queried at high QPS
  • The workload is latency-insensitive
  • There is a single workload with a stable access pattern
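The cost condition can be made concrete with simple arithmetic: tiering pays off when cold storage for the whole corpus plus a hot copy of the working set undercuts keeping everything hot. The prices below are placeholders, not actual AWS rates.

```python
def cost_all_hot(corpus_gb: float, hot_price_per_gb: float) -> float:
    """Monthly cost of keeping the entire corpus in the hot ANN index."""
    return corpus_gb * hot_price_per_gb

def cost_tiered(corpus_gb: float, hot_gb: float,
                cold_price_per_gb: float, hot_price_per_gb: float) -> float:
    """Monthly cost of cold-everything plus a hot copy of the working set.
    The hot subset is stored twice (cold + hot), matching the
    double-storage trade-off of this pattern."""
    return corpus_gb * cold_price_per_gb + hot_gb * hot_price_per_gb

# Placeholder prices (cents per GB-month), purely illustrative:
# 1 TB corpus, 5% working set, cold at 6, hot at 24.
everything_hot = cost_all_hot(1000, 24)    # 24000
tiered = cost_tiered(1000, 50, 6, 24)      # 6000 + 1200 = 7200
```

With these made-up numbers the tiered layout is over 3x cheaper; the gap closes as the working-set fraction grows, which is exactly the "entire corpus is queried at high QPS" case in the right-hand column.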

Why it's a pattern, not just a feature

The shape — bulk-cold + hot-copy, with a managed export path — applies beyond AWS. Any vector DB that can ingest from S3 can participate in the cold side; any hot ANN engine (pgvector on Postgres, Pinecone, Weaviate, Qdrant, Elasticsearch) can be a destination. The AWS implementation happens to be a first-party one-click flow, but the architectural shape is portable.
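One way to read the portability claim: the application-facing contract is just "query a tier," so cold and hot backends can sit behind a common interface. A minimal sketch with stub backends — in practice each stub would wrap s3vectors, pgvector, Pinecone, or any of the other engines named above; the class and workload names here are illustrative.

```python
from typing import Protocol

class VectorTier(Protocol):
    """Anything that can answer an ANN query can be a tier."""
    def query(self, vector: list[float], k: int) -> str: ...

class _StubTier:
    """Stand-in for a real backend (s3vectors, pgvector, Pinecone, ...)."""
    def __init__(self, name: str):
        self.name = name
    def query(self, vector: list[float], k: int) -> str:
        return f"{self.name}:k={k}"

class TierRouter:
    """Send a query to the hot tier if its workload was promoted, else cold."""
    def __init__(self, cold: VectorTier, hot: VectorTier, hot_workloads: set[str]):
        self.cold, self.hot = cold, hot
        self.hot_workloads = set(hot_workloads)
    def query(self, workload: str, vector: list[float], k: int = 10) -> str:
        tier = self.hot if workload in self.hot_workloads else self.cold
        return tier.query(vector, k)

router = TierRouter(cold=_StubTier("s3vectors"),
                    hot=_StubTier("opensearch"),
                    hot_workloads={"recommendations"})
```

Re-tiering over time then reduces to editing `hot_workloads` and re-running the export, rather than changing application code.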

Trade-offs

  • Data freshness — hot copy is a snapshot. Keeping it in sync with cold requires either periodic re-export or a change-data flow (not in the preview-launch scope).
  • Metadata drift — exported vectors bring metadata, but updates to metadata on the cold side don't auto-propagate.
  • Double storage cost for the hot-tier subset (acceptable if the subset is small).
  • Query-API fragmentation — cold queries use the s3vectors client; hot queries use the OpenSearch k-NN API. Application code has to pick the right tier per query.
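A hedged sketch of how the freshness trade-off is commonly handled operationally: trigger a re-export when the hot snapshot is too old or too many cold-side writes have accumulated since the last export. The thresholds are illustrative, and the preview launch ships no such mechanism — this is scaffolding the operator has to build.

```python
def needs_reexport(snapshot_age_s: int, cold_writes_since_export: int,
                   max_age_s: int = 86_400, max_writes: int = 10_000) -> bool:
    """The hot copy is a snapshot: re-export on age or accumulated drift.
    Default thresholds (1 day, 10k writes) are arbitrary placeholders."""
    return snapshot_age_s > max_age_s or cold_writes_since_export > max_writes
```

A scheduled job evaluating this predicate and re-running the export is the simplest sync loop; a change-data flow replaces it when staleness budgets get tight.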

Relation to other patterns

Seen in
