SYSTEM

Amazon S3 Vectors

Amazon S3 Vectors is a first-class S3 data primitive for storing and querying vector similarity indices as native S3 resources. Announced 2025-07-16 in preview (Channy Yun, AWS News Blog) in five regions (IAD, CMH, PDX, FRA, SYD), it is the second new data primitive added to S3 in the 2024–2026 platform-expansion arc (after S3 Tables at re:Invent 2024 and before S3 Files, 2026-04-07). AWS claims up to a 90% total-cost-of-ownership reduction versus running vectors on DRAM/SSD vector-database clusters, with subsecond query performance.

(Primary source: sources/2025-07-16-aws-amazon-s3-vectors-preview-launch)

Resource model

vector bucket  ──┬─ vector index (e.g. "movies")
                 │     ├─ vector  {key, float32[dim], metadata={...}}
                 │     ├─ vector  ...
                 │     └─ ...  (tens of millions / index)
                 └─ vector index ...  (up to 10,000 / bucket)

Launch-time limits:

Dimension                      Value
Vector indexes per bucket      up to 10,000
Vectors per index              tens of millions
Dimensionality                 fixed per index, set at create time
Vector element type            float32
Distance metric (per index)    cosine or Euclidean
Metadata                       key-value pairs per vector, queryable as filters
Encryption                     SSE-S3 (default) or SSE-KMS (systems/aws-kms)

(Source: sources/2025-07-16-aws-amazon-s3-vectors-preview-launch)
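The two supported distance metrics can be sketched in plain Python (illustrative only — S3 Vectors computes distances server-side during queries; this just shows what "cosine" and "Euclidean" mean for a float32 vector pair):

```python
import math

def cosine_distance(a, b):
    # 1 - cos(theta): 0.0 for identical direction, 1.0 for orthogonal,
    # 2.0 for opposite vectors. Magnitude is ignored.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def euclidean_distance(a, b):
    # Straight-line (L2) distance; sensitive to vector magnitude.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

v1 = [1.0, 0.0]
v2 = [0.0, 1.0]
cosine_distance(v1, v2)     # orthogonal -> 1.0
euclidean_distance(v1, v2)  # -> sqrt(2) ~= 1.414
```

Cosine is the usual choice for text embeddings (which are often compared by direction only); Euclidean matters when magnitude carries signal.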

API surface

S3 Vectors exposes a new top-level s3vectors AWS client, separate from the classic s3 client. Worked example from the launch post:

import boto3

s3vectors = boto3.client("s3vectors", region_name="us-west-2")

s3vectors.put_vectors(
    vectorBucketName="channy-vector-bucket",
    indexName="channy-vector-index",
    vectors=[{"key": "v1",
              "data": {"float32": embeddings[0]},
              "metadata": {"id": "key1", "source_text": texts[0],
                           "genre": "scifi"}}, ...],
)

query = s3vectors.query_vectors(
    vectorBucketName="channy-vector-bucket",
    indexName="channy-vector-index",
    queryVector={"float32": embedding},
    topK=3,
    filter={"genre": "scifi"},
    returnDistance=True,
    returnMetadata=True,
)

Control-plane (buckets, indexes) and data-plane (put_vectors, query_vectors, list, delete) are split — same shape as the classic object API. See concepts/control-plane-data-plane-separation.
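The control-plane side can be sketched as a one-time index definition. This is a sketch, not verified code: the parameter names (dataType, dimension, distanceMetric) follow the preview launch post's limits table and should be treated as assumptions to check against the current s3vectors API reference; the actual create calls are shown commented out since they require AWS credentials:

```python
def make_index_config(bucket, index, dimension, metric="cosine"):
    # Control-plane index definition: element type, dimensionality, and
    # distance metric are fixed at create time (per the launch-time limits).
    assert metric in ("cosine", "euclidean")
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "dataType": "float32",   # only float32 at preview
        "dimension": dimension,  # fixed per index
        "distanceMetric": metric,
    }

config = make_index_config("channy-vector-bucket", "channy-vector-index", 1024)

# Control-plane calls (sketch; requires boto3 and AWS credentials):
# s3vectors = boto3.client("s3vectors", region_name="us-west-2")
# s3vectors.create_vector_bucket(vectorBucketName=config["vectorBucketName"])
# s3vectors.create_index(**config)
```

After this setup, the data-plane calls (put_vectors, query_vectors, list, delete) operate inside the index.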

A companion CLI, s3vectors-embed-cli, ships alongside; it embeds text via Bedrock and stores the result into an index in a single command.

Why it exists — Warfield's framing (2026-04-07)

From the later s3-files post:

"Powerful vector databases already existed, and vectors had been quickly working their way in as a feature on existing databases like Postgres. But these systems stored indexes in memory or on SSD, running as compute clusters with live indices. That's the right model for a continuous low-latency search facility, but it's less helpful if you're coming to your data from a storage perspective. Customers were finding that, especially over text-based data like code or PDFs, that the vectors themselves were often more bytes than the data being indexed, stored on media many times more expensive."

Two friction points S3 Vectors addresses:

  1. Cost asymmetry — vectors can exceed indexed data in byte volume; DRAM/SSD cluster storage is expensive relative to S3-tier storage cost for archival / low-QPS workloads.
  2. Operational weight — vector DBs require provisioning, cluster sizing, scaling ops. For "index once, query occasionally, let it grow elastically" workloads this is misfit overhead.

Design anchor

From the 2026-04-07 post:

"S3 Vectors takes a very S3 spin on storing vectors in that its design anchors on a performance, cost and durability profile that is very similar to S3 objects."

  • Cost profile — S3-object-like per-GB, not DRAM/SSD cluster per-GB.
  • Durability profile — same design-centre durability as S3 objects.
  • Always-available HTTP API — similarity search is an API call, not a managed-compute cluster a customer provisions.

Elasticity

"S3 Vectors is designed to be fully elastic, meaning that you can quickly create an index with only a few hundred records in it, and scale over time to billions of records."

No upfront index-size or QPS-tier choice. This is the S3 concepts/elasticity property applied to a new data type. The post states tens of millions/index at preview as a concrete launch ceiling; the "billions" framing is the longer-term design target from the subsequent 2026-04-07 commentary.
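Because there is no capacity pre-declaration, scaling an index from hundreds to millions of records is just repeated put_vectors calls over chunks of records. A minimal chunking helper (the 500-record batch size is an illustrative assumption, not a documented per-request limit):

```python
import itertools

def batched(items, size):
    # Yield fixed-size chunks so each put_vectors request stays small
    # while the index grows elastically behind the API.
    it = iter(items)
    while chunk := list(itertools.islice(it, size)):
        yield chunk

# Hypothetical usage against an s3vectors client (sketch, not verified):
# for chunk in batched(vectors, 500):
#     s3vectors.put_vectors(vectorBucketName="channy-vector-bucket",
#                           indexName="channy-vector-index",
#                           vectors=chunk)
```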

Managed optimization

"As you write, update, and delete vectors over time, S3 Vectors automatically optimizes the vector data to achieve the best possible price-performance for vector storage, even as the datasets scale and evolve."

Same managed-maintenance framing as systems/s3-tables: S3 owns the compaction / re-indexing / tiering work. Internal mechanism (HNSW / IVF / disk-based / hybrid, compaction triggers) not disclosed at preview.

Integrations (launched alongside)

1. Amazon Bedrock Knowledge Bases (RAG vector store)

In the Bedrock Knowledge Base creation flow (Step 3, Vector store creation method), users can select an S3 vector bucket as the vector store — either create a new one or reuse an existing one. Same flow is exposed inside SageMaker Unified Studio when building chat-agent apps on Bedrock.

"You can use S3 Vectors in Amazon Bedrock Knowledge Bases to simplify and reduce the cost of vector storage for RAG applications."

2. Export to OpenSearch (cold → hot tiering)

From the S3 console: Advanced search export → Export to OpenSearch on a vector index. Lands on the OpenSearch Service Integration console with a pre-selected S3 source + service role. Creates a new OpenSearch Serverless collection with a k-NN index and migrates the data.

Stated use case:

"You can balance cost and performance by adopting a tiered strategy that stores long-term vector data cost-effectively in Amazon S3 while exporting high priority vectors to OpenSearch for real-time query performance. ... OpenSearch's high performance (high QPS, low latency) for critical, real-time applications, such as product recommendations or fraud detection, while keeping less time-sensitive data in S3 Vectors."

Canonical instance of patterns/cold-to-hot-vector-tiering and the broader concepts/hybrid-vector-tiering concept.

3. Embedding generation via Bedrock

S3 Vectors does not generate embeddings itself. The paved path is Amazon Titan Text Embeddings V2 (amazon.titan-embed-text-v2:0) or other Bedrock-hosted models:

import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")
response = bedrock.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": text}),
)
embedding = json.loads(response["body"].read())["embedding"]
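Gluing this to the put_vectors example earlier on this page, the per-document work is pure payload shaping. A minimal sketch (helper names are mine; the request and record shapes follow the two snippets above):

```python
import json

def titan_request_body(text):
    # Request payload for Titan Text Embeddings V2, as shown in the post.
    return json.dumps({"inputText": text})

def to_s3_vector(key, embedding, **metadata):
    # Shape one embedding into the put_vectors record format from the post:
    # a key, float32 data, and filterable metadata.
    return {"key": key, "data": {"float32": embedding}, "metadata": metadata}

record = to_s3_vector("v1", [0.1, 0.2], genre="scifi", source_text="...")
# record feeds straight into put_vectors(vectors=[record, ...])
```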

Place in the S3 multi-primitive lineage

Primitive                     Launch              Purpose
Objects                       2006                Byte blobs; immutable; 4-verb HTTP API
Tables (systems/s3-tables)    re:Invent 2024      Managed Apache Iceberg; compaction/GC as S3 operations
Vectors (this page)           Preview 2025-07-16  Elastic similarity-search indices
Files (systems/s3-files)      2026-04-07          NFS mount over S3 data

Each primitive hits S3's baseline properties — elasticity, durability, availability, performance, security — in the presentation best-suited to its workload. See systems/aws-s3 for the overall thesis. Instance of patterns/presentation-layer-over-storage.

Intended workloads (named in the post)

  • Semantic search over images, videos, documents, audio.
  • RAG (retrieval-augmented generation) via Bedrock Knowledge Bases.
  • Agent memory (first-class named use case).
  • Personalized recommendations, automated content analysis, intelligent document processing at scale.

Caveats

  • Preview, not GA (as of 2025-07-16).
  • Internal architecture undisclosed — index type (HNSW / IVF / hybrid / disk-based), compaction policy, recall/QPS trade-offs not stated.
  • Latency claim is qualitative — "subsecond"; no p50/p99/QPS numbers.
  • Metric support at launch — only Cosine and Euclidean; no dot-product, no Hamming.
  • Filter-with-ANN semantics (pre-filter vs post-filter) not documented in the post.
  • No competitive comparison vs pgvector / Pinecone / Weaviate / Qdrant / OpenSearch k-NN on recall or throughput.
  • Dimensionality ceiling / max K not stated.

Seen in

  • sources/2025-07-16-aws-amazon-s3-vectors-preview-launch — primary-source launch announcement (Channy Yun, AWS News Blog). Resource model, API surface, 10K-indexes / tens-of-millions-vectors capacity, Cosine/Euclidean distance, SSE-S3/KMS encryption, Bedrock Knowledge Bases integration, export-to-OpenSearch flow, Titan V2 embedding example, preview regions, up to 90% TCO reduction claim, s3vectors-embed-cli all come from this post.
  • sources/2026-04-07-allthingsdistributed-s3-files-and-the-changing-face-of-s3 — retroactively situates S3 Vectors in the multi-primitive expansion and contributes the why-it-exists framing (vector-storage cost asymmetry, "fully elastic few-hundred → billions"). Design-anchor "performance-cost-durability profile like S3 objects" quote is from this post.
  • sources/2025-12-11-aws-architecting-conversational-observability-for-cloud-applications — S3 Vectors as cold-tier vector store in the Strands agentic deployment of AWS's EKS troubleshooting blueprint. 1024-dimensional embeddings of operational telemetry; chosen over OpenSearch Serverless explicitly as "providing cost-optimized vector storage for AI agents." Canonical wiki reference for S3 Vectors as the cheap-tier substrate in a telemetry-to-RAG pipeline.