AWS — Introducing Amazon S3 Vectors: First cloud storage with native vector support at scale (preview)
Summary¶
Channy Yun (AWS News Blog, 2025-07-16) announces the preview of
Amazon S3 Vectors: a new first-class S3 data
primitive for storing and querying vector similarity indices as
native S3 resources. This is the primary-source launch post; the
S3 Files / changing face of S3
post (2026-04-07) later situated S3 Vectors in the broader multi-primitive
narrative. S3 Vectors introduces two new resource types — vector
buckets and vector indexes — behind a dedicated s3vectors API,
with SSE-S3 or SSE-KMS encryption, metadata filtering, Cosine
or Euclidean distance, and up to 10,000 indexes/bucket ×
tens-of-millions of vectors/index. Claimed up to 90% TCO reduction
vs. running vectors on DRAM/SSD compute clusters, with subsecond query
performance. Two integrations launch alongside it: Bedrock Knowledge Bases
can select an S3 vector bucket as the vector store for RAG apps
(including inside SageMaker
Unified Studio), and an export-to-OpenSearch pipeline migrates
an S3 vector index into an OpenSearch Serverless k-NN collection for
hot, low-latency workloads — canonical instantiation of
patterns/cold-to-hot-vector-tiering.
Key takeaways¶
- S3 gains a vector primitive. "Amazon S3 Vectors is the first cloud object store with native support to store large vector datasets and provide subsecond query performance" (Channy, 2025-07-16). S3's presentation expands from object-bucket + table-bucket to vector bucket. This is the re:Invent-2025-timeframe addition in the three-primitive-expansion arc Warfield later narrated (Tables 2024 → Vectors 2025 → Files 2026). See systems/aws-s3 "Place in the multi-primitive lineage".
- Resource model: vector bucket → vector index → vectors + metadata. Each vector bucket holds up to 10,000 vector indexes; each vector index holds tens of millions of vectors. Each vector has a key, a `float32` array (all vectors in an index must share the same dimensionality), and optional metadata as key-value pairs (dates, categories, user preferences) usable as query filters. "As you write, update, and delete vectors over time, S3 Vectors automatically optimizes the vector data to achieve the best possible price-performance for vector storage" — managed-compaction framing analogous to S3 Tables' Iceberg maintenance.
- Distance metric is per-index, Cosine or Euclidean, set at `CreateVectorIndex` time. "When creating vector embeddings, select your embedding model's recommended distance metric for more accurate results." No mention of dot-product or Hamming at preview. Pins the concepts/vector-similarity-search shape at launch.
- No provisioned infrastructure; pay for storage + queries. "A new bucket type with a dedicated set of APIs to store, access, and query vector data without provisioning any infrastructure." Elasticity is the explicit anchor — "can reduce the total cost of uploading, storing, and querying vectors by up to 90 percent ... affordable for businesses to store AI-ready data at massive scale." Extends concepts/elasticity — the S3 property — to vector indices.
- Query API (`query_vectors`) returns top-K with filter + distance + metadata. The worked example runs a `topK=3` query with `filter={"genre": "scifi"}`, `returnDistance=True`, and `returnMetadata=True` on an index of three movie-plot embeddings. `put_vectors` inserts batches; key + float32 data + metadata are set per vector. Same `boto3` client pattern as other S3 APIs.
- Bedrock Knowledge Bases: S3 Vectors as RAG vector store. "You can use S3 Vectors in Amazon Bedrock Knowledge Bases to simplify and reduce the cost of vector storage for RAG applications." In the Bedrock console's knowledge-base creation flow, Step 3 (Vector store creation method) lets the user either create a fresh S3 vector bucket + index or reuse an existing one. Integrated similarly inside Amazon SageMaker Unified Studio's Bedrock knowledge-base components for chat agent apps. This is the "make RAG cheap" play — vectors sit in S3-tier storage, not DRAM/SSD.
- Tiered cold↔hot strategy: export to OpenSearch. "You can balance cost and performance by adopting a tiered strategy that stores long-term vector data cost-effectively in Amazon S3 while exporting high priority vectors to OpenSearch for real-time query performance." Console action: Advanced search export → Export to OpenSearch on a vector index → OpenSearch Service Integration console with pre-selected S3 source → creates a new OpenSearch Serverless collection with a k-NN index and migrates data. The use case: "product recommendations or fraud detection" on hot data, archive-style vectors (e.g. historical) kept in S3 Vectors. Canonical patterns/cold-to-hot-vector-tiering; extends concepts/hybrid-vector-tiering.
- Encryption defaults and options. "If you don't specify an encryption type, Amazon S3 applies server-side encryption with Amazon S3 managed keys (SSE-S3) as the base level of encryption for new vectors. You can also choose server-side encryption with AWS Key Management Service (AWS KMS) keys (SSE-KMS)." Same two-tier encryption model as object buckets; no Vectors-specific policy surface noted at preview.
- Embedding generation is not in S3 Vectors itself; Bedrock is the paved path. "To generate vector embeddings for your unstructured data, you can use embedding models offered by Amazon Bedrock." The worked example invokes `amazon.titan-embed-text-v2:0` (systems/amazon-titan-embeddings) via `bedrock.invoke_model`, then `put_vectors` into the index. Separately, the `s3vectors-embed-cli` single-command tool wraps "embed with Bedrock + store in index" for scripts.
- Preview regions (2025-07-16): US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Frankfurt), Asia Pacific (Sydney). Feedback channel: AWS re:Post for S3.
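The per-index metric choice above is not cosmetic: cosine ignores vector magnitude while Euclidean does not, so the two can disagree on the nearest neighbor when embeddings are not unit-normalized. A minimal pure-Python sketch of the two metrics as commonly defined (illustrative only, not the S3 Vectors implementation):

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity; compares direction only, ignores magnitude.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def euclidean_distance(a, b):
    # Straight-line distance; sensitive to magnitude.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [1.0, 0.0]
candidates = {
    "same-direction-long": [10.0, 0.0],  # same direction, far away
    "nearby-short": [0.8, 0.6],          # close by, different direction
}

nearest_cos = min(candidates, key=lambda k: cosine_distance(query, candidates[k]))
nearest_euc = min(candidates, key=lambda k: euclidean_distance(query, candidates[k]))
print(nearest_cos, nearest_euc)  # prints: same-direction-long nearby-short
```

The long vector wins under cosine (identical direction, distance 0) while the short one wins under Euclidean, which is why the post advises using the embedding model's recommended metric.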
Architecture / API numbers¶
| Dimension | Launch-time value |
|---|---|
| Capacity — indexes/bucket | up to 10,000 |
| Capacity — vectors/index | tens of millions |
| Distance metrics | Cosine or Euclidean (per-index) |
| Vector data type | float32 array; all vectors in an index share dimensionality |
| Metadata | key-value pairs attached per vector; usable as query filters |
| Query | top-K, optional filter, optional returnDistance, returnMetadata |
| Encryption | SSE-S3 (default) or SSE-KMS |
| Claimed TCO | up to 90% reduction vs. vectors on DRAM/SSD compute clusters |
| Query latency | subsecond (no concrete p50/p99) |
| Preview regions | IAD, CMH, PDX, FRA, SYD |
| Pricing model | no provisioned infra — pay storage + queries (elastic) |
API surface (worked example in the post)¶
```python
import boto3

s3vectors = boto3.client("s3vectors", region_name="us-west-2")

# Insert — `embeddings` and `texts` come from the post's earlier
# Bedrock invoke_model calls (Titan Text Embeddings V2).
s3vectors.put_vectors(
    vectorBucketName="channy-vector-bucket",
    indexName="channy-vector-index",
    vectors=[
        {"key": "v1",
         "data": {"float32": embeddings[0]},
         "metadata": {"id": "key1", "source_text": texts[0], "genre": "scifi"}},
        # ...
    ],
)

# Query (top-K similarity with metadata filter)
query = s3vectors.query_vectors(
    vectorBucketName="channy-vector-bucket",
    indexName="channy-vector-index",
    queryVector={"float32": embedding},
    topK=3,
    filter={"genre": "scifi"},
    returnDistance=True,
    returnMetadata=True,
)
```
Distinct control-plane verbs at preview (inferred from the post): create vector bucket, create vector index, list vectors. Data-plane verbs: `put_vectors`, `query_vectors` (implied: list, delete). The shape follows the control-plane / data-plane separation familiar from the object API.
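Since the control-plane verbs are only inferred, the following is a hypothetical sketch of what provisioning might look like with the `s3vectors` client. The verb names (`create_vector_bucket`, `create_index`) and parameter names here are assumptions extrapolated from the post's data-plane naming style, not a confirmed API reference:

```python
def provision_vector_store(client, bucket_name, index_name, dimension):
    """Create a vector bucket and a cosine-metric index inside it.

    `client` is expected to behave like boto3.client("s3vectors").
    Verb and parameter names are assumptions inferred from the launch
    post's naming conventions, not a confirmed API surface.
    """
    # Bucket-level call; SSE-S3 applies by default per the post.
    client.create_vector_bucket(vectorBucketName=bucket_name)
    # Index-level call; the distance metric is fixed at creation time.
    client.create_index(
        vectorBucketName=bucket_name,
        indexName=index_name,
        dataType="float32",       # only data type mentioned at preview
        dimension=dimension,      # all vectors in an index share this
        distanceMetric="cosine",  # or "euclidean"; per-index, immutable
    )
```

With a real client this would be `provision_vector_store(boto3.client("s3vectors"), "my-bucket", "my-index", 1024)`; the helper takes the client as a parameter so the call shapes can be exercised without AWS credentials.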
Intended workloads (as stated in the post)¶
- Semantic search over unstructured data (images, videos, documents, audio).
- Retrieval-Augmented Generation (RAG) — canonical paved path via Bedrock Knowledge Bases.
- Agent memory — "build agent memory" named as a first-class use case alongside semantic search and RAG.
- Personalized recommendations, automated content analysis, intelligent document processing at scale.
Caveats¶
- Preview, not GA. Launch-time regions limited to five (IAD, CMH, PDX, FRA, SYD). Behaviour, limits, and pricing may change before GA.
- No internal architecture disclosed. The post does not describe the index structure (HNSW / IVF / disk-based / hybrid), recall/QPS trade-offs, compaction strategy for vector data, or how "up to 90% TCO reduction" is achieved mechanically. "S3 Vectors automatically optimizes the vector data" is all that is said.
- No concrete latency numbers. "Subsecond query performance" is the only latency claim; no p50 / p99 / QPS ceiling per index.
- No competitive recall/quality comparison. No numbers against pgvector / Pinecone / Weaviate / OpenSearch k-NN / Qdrant on standard benchmarks (SIFT, GloVe, ANN-Benchmarks).
- Filter-with-ANN semantics not discussed (pre-filter vs post-filter — a common recall/latency corner in ANN systems).
- Dimensionality ceiling / max K not stated.
- Post is marketing-leaning — it is a launch announcement on AWS News Blog. Primary architectural signal comes from the API shape, the capacity numbers, and the tiering-to-OpenSearch story; treat TCO and latency claims as vendor-stated.
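The filter-semantics caveat can be made concrete with a brute-force sketch: whether the metadata filter is applied before or after the top-K cut changes what a query returns. This is an exact-search illustration of the general pre-filter vs post-filter issue, not a claim about how S3 Vectors implements filtering; all names below are hypothetical.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def pre_filter_query(vectors, query, top_k, genre):
    # Pre-filter: restrict to matching vectors first, then rank.
    pool = [v for v in vectors if v["metadata"]["genre"] == genre]
    return sorted(pool, key=lambda v: euclidean(v["data"], query))[:top_k]

def post_filter_query(vectors, query, top_k, genre):
    # Post-filter: rank everything, cut to top_k, then filter.
    # Can return fewer than top_k matches.
    nearest = sorted(vectors, key=lambda v: euclidean(v["data"], query))[:top_k]
    return [v for v in nearest if v["metadata"]["genre"] == genre]

vectors = [
    {"key": "v1", "data": [0.1, 0.1], "metadata": {"genre": "scifi"}},
    {"key": "v2", "data": [0.2, 0.1], "metadata": {"genre": "drama"}},
    {"key": "v3", "data": [0.9, 0.9], "metadata": {"genre": "scifi"}},
]
query = [0.0, 0.0]
pre = pre_filter_query(vectors, query, 2, "scifi")    # v1 and v3
post = post_filter_query(vectors, query, 2, "scifi")  # only v1 survives the cut
```

Here the post-filter path loses v3 because v2 (a non-match) occupied a top-K slot, which is exactly the recall corner the caveat flags as undocumented at preview.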
Related wiki pages updated / created¶
- Created: sources/2025-07-16-aws-amazon-s3-vectors-preview-launch (this file); concepts/vector-embedding, concepts/vector-similarity-search, concepts/hybrid-vector-tiering; systems/amazon-bedrock-knowledge-bases, systems/amazon-opensearch-service, systems/amazon-titan-embeddings, systems/amazon-sagemaker-unified-studio; patterns/cold-to-hot-vector-tiering.
- Updated: systems/s3-vectors (was a stub created from the 2026-04-07 secondary mention; now the primary source), systems/aws-s3, companies/aws, index.md, log.md.
Seen in¶
- This post (primary-source launch, 2025-07-16).
- sources/2026-04-07-allthingsdistributed-s3-files-and-the-changing-face-of-s3 — situates S3 Vectors in the multi-primitive expansion (Warfield, 2026-04-07). Cross-references confirm the launch-era framing.