CONCEPT

Hybrid Vector Tiering (Cold S3 ↔ Hot OpenSearch)

Hybrid vector tiering is the storage-and-query pattern built on the observation that vector workloads have bimodal access profiles: a large, slow-growing archival set where storage cost dominates, and a small, high-QPS working set where query latency dominates. The pattern places each set in its cost-appropriate tier and keeps a cheap migration path between them.

(Source: sources/2025-07-16-aws-amazon-s3-vectors-preview-launch)

Why the split exists

Different vector workloads exert different pressure on the storage system:

| Workload | What matters | Right tier |
|---|---|---|
| Semantic search over an archive (historical media, slow-growing RAG corpora, agent memory) | Storage cost per vector | S3-tier (e.g. systems/s3-vectors) |
| Real-time recommendations, fraud detection | QPS + p99 latency | DRAM/SSD-tier (e.g. OpenSearch Serverless k-NN) |

AWS's 2025-07-16 launch post gives the canonical articulation:

"You can balance cost and performance by adopting a tiered strategy that stores long-term vector data cost-effectively in Amazon S3 while exporting high priority vectors to OpenSearch for real-time query performance. ... OpenSearch's high performance (high QPS, low latency) for critical, real-time applications, such as product recommendations or fraud detection, while keeping less time-sensitive data in S3 Vectors."

Structure of the tiering

                ┌─────────────────────┐
                │  Ingest (Bedrock    │
                │  embedding models)  │
                └──────────┬──────────┘
                           │
            ┌──────────────▼───────────┐
            │   S3 Vectors (cold)      │
            │   storage-optimized      │
            │   tens-of-M to billions  │
            │   subsecond query        │
            └──────────┬───────────────┘
                       │  "Advanced search export →
                       │   Export to OpenSearch"
            ┌──────────▼───────────────┐
            │  OpenSearch Serverless   │
            │  k-NN (hot)              │
            │  low-latency real-time   │
            │  DRAM/SSD-backed         │
            └──────────────────────────┘

The cold tier is the durability + capacity home for vectors. The hot tier is a selective, derived view copied from the cold tier for workloads that need real-time performance.
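The promote/demote relationship between the tiers can be sketched as two small operations. This is a hypothetical outline, not the AWS export API: `cold_store`, `hot_index`, and their method names (`get_vector`, `upsert`, `delete`) are illustrative stand-ins for an S3 Vectors index and an OpenSearch Serverless k-NN collection.

```python
# Hypothetical sketch of the tiering lifecycle. The cold tier is the
# source of truth; the hot tier is a derived, selective view that can
# be dropped and rebuilt from cold storage at any time.

def promote_working_set(cold_store, hot_index, keys):
    """Copy high-priority vectors from the cold tier into the hot tier."""
    for key in keys:
        # Assumed record shape: {"key": ..., "vector": [...], "metadata": {...}}
        record = cold_store.get_vector(key)
        hot_index.upsert(
            id=record["key"],
            vector=record["vector"],
            metadata=record["metadata"],
        )

def demote(hot_index, keys):
    """Evict vectors whose access rate no longer justifies hot storage.

    Nothing is lost on demotion: the authoritative copy stays cold.
    """
    for key in keys:
        hot_index.delete(id=key)
```

Because the hot index is purely derived, promotion is idempotent and demotion is safe, which is what makes the migration path between tiers cheap.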

Cost asymmetry motivates the pattern

Warfield (2026-04-07) names the storage-economics argument explicitly:

"Customers were finding that, especially over text-based data like code or PDFs, that the vectors themselves were often more bytes than the data being indexed, stored on media many times more expensive."

For a static or slow-growing corpus with low QPS, keeping everything in a DRAM/SSD vector cluster means paying compute-plus-memory rent on data that doesn't need hot latency. Shipping those vectors to cold storage recovers that budget; the hot tier is then sized only for the working set.
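The arithmetic behind this is straightforward. The sketch below uses hypothetical per-GB prices and a hypothetical 5% working set purely to show the shape of the asymmetry; the numbers are not AWS list prices.

```python
# Illustrative cost comparison for the tiering decision.
# All prices and the working-set fraction are hypothetical placeholders.

DIM = 1536                       # a common embedding dimension
BYTES_PER_VECTOR = DIM * 4       # float32 components
N = 100_000_000                  # 100M archival vectors

total_gb = N * BYTES_PER_VECTOR / 1e9          # ≈ 614 GB of raw vectors

DRAM_TIER_PER_GB_MONTH = 0.50    # hypothetical in-memory cluster cost
S3_TIER_PER_GB_MONTH = 0.03      # hypothetical object-storage cost
WORKING_SET = 0.05               # hypothetical hot fraction

all_hot = total_gb * DRAM_TIER_PER_GB_MONTH
all_cold = total_gb * S3_TIER_PER_GB_MONTH
tiered = (total_gb * WORKING_SET * DRAM_TIER_PER_GB_MONTH
          + total_gb * (1 - WORKING_SET) * S3_TIER_PER_GB_MONTH)

print(f"all-hot:  ${all_hot:,.2f}/mo")
print(f"all-cold: ${all_cold:,.2f}/mo")
print(f"tiered:   ${tiered:,.2f}/mo")
```

Under these placeholder numbers the tiered layout lands close to the all-cold floor while still serving the working set at hot latency, which is the whole argument for the pattern.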

(Source: sources/2026-04-07-allthingsdistributed-s3-files-and-the-changing-face-of-s3)

Contrast with single-tier approaches

  • DRAM-only vector DB (historical Pinecone / Weaviate posture): best latency, worst cost/GB. Forces all vectors — active or not — into expensive storage.
  • Disk-based ANN (pgvector, recent DiskANN variants): cheaper, latency dependent on SSD seek patterns; still runs on provisioned compute clusters.
  • Storage-first ANN (S3 Vectors): cheapest bulk storage, elastic, "subsecond" but not microsecond; no provisioned cluster.

Hybrid tiering doesn't pick a winner — it uses the right tier per access pattern within a single application.
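"The right tier per access pattern" reduces, on the read path, to a one-branch router. The `hot` and `cold` clients and their `query` method below are hypothetical stand-ins for the two tiers, not real SDK calls.

```python
# Minimal read-path router for a hybrid-tiered application.
# `hot` and `cold` are assumed query clients with a shared
# query(vector=..., top_k=...) interface (illustrative, not a real SDK).

def search(query_vector, k, *, real_time, hot, cold):
    if real_time:
        # p99-sensitive path (recommendations, fraud detection):
        # DRAM/SSD-backed index over the small working set.
        return hot.query(vector=query_vector, top_k=k)
    # Cost-sensitive path (archival semantic search, agent memory):
    # storage-optimized index, subsecond rather than microsecond latency.
    return cold.query(vector=query_vector, top_k=k)
```

The application decides per request class, not per database: both tiers sit behind one search function, which is what "within a single application" means here.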
