AWS 2025-07-16 Tier 1

AWS — Introducing Amazon S3 Vectors: First cloud storage with native vector support at scale (preview)

Summary

Channy Yun (AWS News Blog, 2025-07-16) announces the preview of Amazon S3 Vectors: a new first-class S3 data primitive for storing and querying vector similarity indices as native S3 resources. This is the primary-source launch post; the S3 Files / changing face of S3 post (2026-04-07) later situated S3 Vectors in the broader multi-primitive narrative. S3 Vectors introduces two new resource types — vector buckets and vector indexes — behind a dedicated s3vectors API, with SSE-S3 or SSE-KMS encryption, metadata filtering, Cosine or Euclidean distance, and up to 10,000 indexes/bucket × tens-of-millions of vectors/index. Claimed up to 90% TCO reduction vs. running vectors on DRAM/SSD compute clusters, with subsecond query performance. Two integrations are launched alongside: Bedrock Knowledge Bases can select an S3 vector bucket as the vector store for RAG apps (including inside SageMaker Unified Studio), and an export-to-OpenSearch pipeline migrates an S3 vector index into an OpenSearch Serverless k-NN collection for hot, low-latency workloads — canonical instantiation of patterns/cold-to-hot-vector-tiering.

Key takeaways

  1. S3 gains a vector primitive. "Amazon S3 Vectors is the first cloud object store with native support to store large vector datasets and provide subsecond query performance" (Channy, 2025-07-16). S3's presentation expands from object-bucket + table-bucket to vector bucket. This is the re:Invent-2025-timeframe addition in the three-primitive-expansion arc Warfield later narrated (Tables 2024 → Vectors 2025 → Files 2026). See systems/aws-s3 "Place in the multi-primitive lineage".
  2. Resource model: vector bucket → vector index → vectors + metadata. Each vector bucket holds up to 10,000 vector indexes; each vector index holds tens of millions of vectors. Each vector has a key, a float32 array (all vectors in an index must have the same dimensionality), and optional metadata as key-value pairs (dates, categories, user preferences) usable as query filters. "As you write, update, and delete vectors over time, S3 Vectors automatically optimizes the vector data to achieve the best possible price-performance for vector storage" — managed-compaction framing analogous to S3 Tables' Iceberg maintenance.
  3. Distance metric is per-index, cosine or Euclidean. Set at CreateVectorIndex time. "When creating vector embeddings, select your embedding model's recommended distance metric for more accurate results." No mention of dot-product/Hamming at preview. Pins concepts/vector-similarity-search shape at launch.
  4. No provisioned infra; pay for storage + queries. "A new bucket type with a dedicated set of APIs to store, access, and query vector data without provisioning any infrastructure." Elasticity is the explicit anchor — "can reduce the total cost of uploading, storing, and querying vectors by up to 90 percent ... affordable for businesses to store AI-ready data at massive scale." Extends concepts/elasticity — the S3 property — to vector indices.
  5. Query API (s3vectors.query_vectors) returns top-K with filter + distance + metadata. The worked example uses a topK=3 query with filter={"genre":"scifi"}, returnDistance=True, returnMetadata=True on an index of 3 movie-plot embeddings. put_vectors inserts batches; keys + float32 data + metadata are set per vector. Same boto3 client pattern as other S3 APIs.
  6. Bedrock Knowledge Bases: S3 Vectors as RAG vector store. "You can use S3 Vectors in Amazon Bedrock Knowledge Bases to simplify and reduce the cost of vector storage for RAG applications." In the Bedrock console's knowledge-base creation flow, Step 3 Vector store creation method lets the user either create a fresh S3 vector bucket+index or reuse an existing one. Integrated similarly inside Amazon SageMaker Unified Studio's Bedrock-knowledge-base components for chat agent apps. This is the "make RAG cheap" play — vectors sit in S3-tier storage, not DRAM/SSD.
  7. Tiered cold↔hot strategy: export to OpenSearch. "You can balance cost and performance by adopting a tiered strategy that stores long-term vector data cost-effectively in Amazon S3 while exporting high priority vectors to OpenSearch for real-time query performance." Console action: Advanced search export → Export to OpenSearch on a vector index → OpenSearch Service Integration console with pre-selected S3 source → creates a new OpenSearch Serverless collection with a k-NN index and migrates data. The use case: "product recommendations or fraud detection" on hot data, archive-style vectors (e.g. historical) kept in S3 Vectors. Canonical patterns/cold-to-hot-vector-tiering; extends concepts/hybrid-vector-tiering.
  8. Encryption defaults and options. "If you don't specify an encryption type, Amazon S3 applies server-side encryption with Amazon S3 managed keys (SSE-S3) as the base level of encryption for new vectors. You can also choose server-side encryption with AWS Key Management Service (AWS KMS) keys (SSE-KMS)." Same two-tier encryption model as object buckets; no Vectors-specific policy surface noted at preview.
  9. Embedding generation is not in S3 Vectors itself; Bedrock is the paved path. "To generate vector embeddings for your unstructured data, you can use embedding models offered by Amazon Bedrock." The worked example invokes amazon.titan-embed-text-v2:0 (systems/amazon-titan-embeddings) via bedrock.invoke_model then put_vectors into the index. Separately, the s3vectors-embed-cli single-command tool wraps "embed with Bedrock + store in index" for scripts.
  10. Preview regions (2025-07-16): US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Frankfurt), Asia Pacific (Sydney). Feedback channel: AWS re:Post for S3.
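The per-index metric choice in takeaway 3 matters because cosine and Euclidean rank neighbors differently when vectors are not normalized. A minimal pure-Python illustration of the two launch-time metrics (this is a sketch of the math, not the S3 Vectors implementation):

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity; smaller means more similar.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Same direction, different magnitude: cosine treats them as identical,
# Euclidean does not -- hence "select your embedding model's recommended
# distance metric for more accurate results".
a, b = [1.0, 0.0], [3.0, 0.0]
print(cosine_distance(a, b))     # 0.0
print(euclidean_distance(a, b))  # 2.0
```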

Architecture / API numbers

Dimension                      Launch-time value
Capacity: indexes per bucket   up to 10,000
Capacity: vectors per index    tens of millions
Distance metrics               Cosine or Euclidean (set per index)
Vector data type               float32 array; all vectors in an index share one dimensionality
Metadata                       key-value pairs attached per vector; usable as query filters
Query                          top-K, optional filter, optional returnDistance / returnMetadata
Encryption                     SSE-S3 (default) or SSE-KMS
Claimed TCO                    up to 90% reduction vs. vectors on DRAM/SSD compute clusters
Query latency                  subsecond (no concrete p50/p99 given)
Preview regions                IAD, CMH, PDX, FRA, SYD
Pricing model                  no provisioned infrastructure; pay for storage + queries (elastic)
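Two of the launch-time constraints above — one shared dimensionality per index and flat key-value metadata — can be checked client-side before a put. The validator below is our own sketch, not part of the s3vectors API; the service enforces its own validation:

```python
def validate_batch(vectors, index_dimension):
    """Client-side sanity checks mirroring the launch-time constraints:
    every vector in an index shares one dimensionality, data is a
    float array, and metadata values are flat scalars (not nested)."""
    for v in vectors:
        data = v["data"]["float32"]
        if len(data) != index_dimension:
            raise ValueError(
                f"vector {v['key']!r}: got {len(data)} dims, "
                f"index expects {index_dimension}")
        if not all(isinstance(x, float) for x in data):
            raise TypeError(f"vector {v['key']!r}: data must be floats")
        for k, val in v.get("metadata", {}).items():
            if isinstance(val, (dict, list)):
                raise TypeError(
                    f"vector {v['key']!r}: metadata value for {k!r} "
                    f"must be a scalar")
    return True

batch = [{"key": "v1",
          "data": {"float32": [0.1, 0.2]},
          "metadata": {"genre": "scifi"}}]
print(validate_batch(batch, index_dimension=2))  # True
```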

API surface (worked example in the post)

import boto3

s3vectors = boto3.client("s3vectors", region_name="us-west-2")

# Insert (embeddings/texts were produced earlier, e.g. via Bedrock's
# amazon.titan-embed-text-v2:0 as in takeaway 9)
s3vectors.put_vectors(
    vectorBucketName="channy-vector-bucket",
    indexName="channy-vector-index",
    vectors=[
        {"key": "v1",
         "data": {"float32": embeddings[0]},
         "metadata": {"id": "key1", "source_text": texts[0], "genre": "scifi"}},
        # ...
    ],
)

# Query (top-K similarity with metadata filter)
query = s3vectors.query_vectors(
    vectorBucketName="channy-vector-bucket",
    indexName="channy-vector-index",
    queryVector={"float32": embedding},
    topK=3,
    filter={"genre": "scifi"},
    returnDistance=True,
    returnMetadata=True,
)

Distinct control-plane verbs at preview (inferred from the post): create vector bucket, create vector index. Data-plane verbs: put_vectors, query_vectors, plus implied list and delete operations on vectors. The shape follows the control-plane / data-plane separation familiar from the object API.
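The query semantics implied by the worked example — top-K nearest, optional exact-match metadata filter, distances and metadata in the response — can be mirrored with a brute-force reference implementation. A sketch under the assumption of Euclidean distance; this says nothing about how S3 Vectors indexes internally:

```python
import math

def query_vectors_ref(index, query, top_k, metadata_filter=None):
    """Brute-force top-K over a list of {"key", "data", "metadata"}
    records, loosely mirroring the query_vectors response shape."""
    def dist(v):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(v, query)))
    candidates = [
        r for r in index
        if metadata_filter is None
        or all(r["metadata"].get(k) == v for k, v in metadata_filter.items())
    ]
    ranked = sorted(candidates, key=lambda r: dist(r["data"]))
    return [
        {"key": r["key"], "distance": dist(r["data"]), "metadata": r["metadata"]}
        for r in ranked[:top_k]
    ]

index = [
    {"key": "v1", "data": [0.0, 1.0], "metadata": {"genre": "scifi"}},
    {"key": "v2", "data": [1.0, 0.0], "metadata": {"genre": "drama"}},
    {"key": "v3", "data": [0.1, 0.9], "metadata": {"genre": "scifi"}},
]
res = query_vectors_ref(index, [0.0, 1.0], top_k=3,
                        metadata_filter={"genre": "scifi"})
print([r["key"] for r in res])  # ['v1', 'v3']
```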

Intended workloads (as stated in the post)

  • Semantic search over unstructured data (images, videos, documents, audio).
  • Retrieval-Augmented Generation (RAG) — canonical paved path via Bedrock Knowledge Bases.
  • Agent memory — "build agent memory" is named as a first-class use case alongside semantic search and RAG.
  • Personalized recommendations, automated content analysis, intelligent document processing at scale.

Caveats

  • Preview, not GA. Launch-time regions limited to five (IAD, CMH, PDX, FRA, SYD). Behaviour, limits, and pricing may change before GA.
  • No internal architecture disclosed. The post does not describe the index structure (HNSW / IVF / disk-based / hybrid), recall/QPS trade-offs, compaction strategy for vector data, or how "up to 90% TCO reduction" is achieved mechanically. "S3 Vectors automatically optimizes the vector data" is all that is said.
  • No concrete latency numbers. "Subsecond query performance" is the only latency claim; no p50 / p99 / QPS ceiling per index.
  • No competitive recall/quality comparison. No numbers against pgvector / Pinecone / Weaviate / OpenSearch k-NN / Qdrant on standard benchmarks (SIFT, GloVe, ANN-Benchmarks).
  • Filter-with-ANN semantics not discussed (pre-filter vs post-filter — a common recall/latency corner in ANN systems).
  • Dimensionality ceiling / max K not stated.
  • Post is marketing-leaning — it is a launch announcement on AWS News Blog. Primary architectural signal comes from the API shape, the capacity numbers, and the tiering-to-OpenSearch story; treat TCO and latency claims as vendor-stated.
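The pre-filter vs post-filter caveat can be made concrete with a toy example (pure Python; this illustrates the generic ANN corner, not S3 Vectors' undisclosed behavior):

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(records, query, k):
    return sorted(records, key=lambda r: dist(r["vec"], query))[:k]

records = [
    {"key": "a", "vec": [0.0], "genre": "drama"},
    {"key": "b", "vec": [0.1], "genre": "drama"},
    {"key": "c", "vec": [0.2], "genre": "scifi"},
    {"key": "d", "vec": [0.3], "genre": "scifi"},
]
query, k = [0.0], 2

def want(r):
    return r["genre"] == "scifi"

# Post-filter: rank everything, then filter -- matching results can be
# crowded out of the top-K entirely by non-matching neighbors.
post = [r["key"] for r in top_k(records, query, k) if want(r)]

# Pre-filter: restrict to matches first, then rank -- returns a full K.
pre = [r["key"] for r in top_k([r for r in records if want(r)], query, k)]

print(post)  # [] -- both nearest neighbors were drama
print(pre)   # ['c', 'd']
```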
