SYSTEM Cited by 2 sources

Amazon S3 Tables

S3 Tables (GA Dec 2024, re:Invent) is Amazon S3's first-class table resource — a managed Apache Iceberg offering where the table, not the underlying objects, is the primary resource for policy, addressing, and managed operations. It reframes years of customer-managed Iceberg-on-S3 into an S3 primitive.

(Source: sources/2025-03-14-allthingsdistributed-s3-simplicity-is-table-stakes)

Why it exists

Customers were already running multi-petabyte Iceberg databases on S3. Warfield's summary of the pain (2025):

"With Iceberg what they were really doing was building their own table primitive over S3 objects, and they asked us why S3 wasn't able to do more of the work to make that experience simple."

Specific gaps in customer-managed Iceberg-on-S3:

  • Compaction and garbage collection are the customer's job. Snapshot-based updates fragment tables; performance degrades without periodic compaction passes. Customer code owns those jobs.
  • S3 features don't apply cleanly. Intelligent-Tiering and cross-region replication treat each object independently and don't understand Iceberg's logical layout — so they can tier or replicate incorrectly relative to table semantics.
  • Access control operates at the object level, not at the logical table; IAM policies have to enumerate the object prefixes that make up each table.
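
To make the first gap concrete, here is a minimal sketch of the planning step a customer-owned compaction job has to perform: group small data files into rewrite batches close to a target size. The function name, the thresholds, and the first-fit-decreasing strategy are illustrative assumptions; the source does not describe how customers or S3 Tables actually choose batches.

```python
from typing import List


def plan_compaction(
    file_sizes_mb: List[int],
    target_mb: int = 512,      # hypothetical target size for a rewritten file
    small_file_mb: int = 128,  # hypothetical threshold below which a file is "small"
) -> List[List[int]]:
    """Group small files into rewrite batches using first-fit decreasing.

    Returns a list of batches; each batch lists the sizes of the files
    that would be merged into one larger file. Files at or above
    small_file_mb are left alone.
    """
    small = sorted((s for s in file_sizes_mb if s < small_file_mb), reverse=True)
    batches: List[List[int]] = []
    for size in small:
        for batch in batches:
            # Place the file in the first batch that still has room.
            if sum(batch) + size <= target_mb:
                batch.append(size)
                break
        else:
            # No existing batch has room, so start a new one.
            batches.append([size])
    # A batch with a single file gains nothing from being rewritten.
    return [b for b in batches if len(b) > 1]


if __name__ == "__main__":
    sizes = [600, 100, 100, 100, 100, 100, 50]
    print(plan_compaction(sizes, target_mb=256, small_file_mb=128))
```

In a real Iceberg deployment the rewrite would also have to be committed as a new snapshot and the replaced files garbage-collected later, which is the operational burden the bullet above describes.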

Architectural moves

  1. Table as a first-class resource. Each table "surfaces behind its own endpoint and is a resource from a policy perspective — this makes it much easier to control and share access by setting policy on the table itself and not on the individual objects that it is composed of."
  2. New table-level APIs. Table create + snapshot commit become single S3 calls rather than sequences of object PUTs orchestrated by Iceberg client code.
  3. S3 internalises the Iceberg layout to run compaction, GC, and tiering as managed operations — the same model S3 uses for object placement and storage-class tiering, adapted to the table abstraction.
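
Move 1 can be illustrated by contrasting the two policy shapes. The sketch below only builds IAM policy documents as Python dictionaries; it makes no AWS calls. The bucket name, prefixes, account ID, and table ID are hypothetical, and the `s3tables:` action names and ARN layout reflect my reading of the AWS documentation, so verify them against the current IAM reference before use.

```python
import json
from typing import Dict, List


def object_level_read_policy(bucket: str, table_prefixes: List[str]) -> Dict:
    """Read access to customer-managed Iceberg tables stored as plain objects.

    Every table contributes its own metadata/ and data/ prefixes, so the
    policy grows with the number of tables.
    """
    resources = []
    for prefix in table_prefixes:
        p = prefix.strip("/")
        resources.append(f"arn:aws:s3:::{bucket}/{p}/metadata/*")
        resources.append(f"arn:aws:s3:::{bucket}/{p}/data/*")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": resources}
        ],
    }


def table_level_read_policy(table_arn: str) -> Dict:
    """Read access expressed against the table resource itself.

    Action names are assumptions based on the S3 Tables documentation.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3tables:GetTableData",
                    "s3tables:GetTableMetadataLocation",
                ],
                "Resource": [table_arn],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical names throughout.
    old = object_level_read_policy(
        "example-lake", ["warehouse/sales/orders", "warehouse/sales/customers"]
    )
    new = table_level_read_policy(
        "arn:aws:s3tables:us-east-1:111122223333:bucket/example-tables/table/EXAMPLE-TABLE-ID"
    )
    print(json.dumps(old, indent=2))
    print(json.dumps(new, indent=2))
```

The object-level policy has to track the physical layout of each table, while the table-level policy names one resource regardless of how many objects sit behind it.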

Post-launch evolution (first 14 weeks)

  • Iceberg REST Catalog (IRC) API support.
  • In-console query for tables.
  • DuckDB Iceberg collaboration highlighted to accelerate cross-engine adoption.
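
As a rough illustration of what IRC support enables, an Iceberg client can treat a table bucket as a REST catalog. The helper below only assembles connection properties; the endpoint shape and property names are assumptions based on the AWS and PyIceberg documentation as I understand them, and the account ID and bucket name are hypothetical.

```python
from typing import Dict


def s3tables_rest_catalog_properties(region: str, table_bucket_arn: str) -> Dict[str, str]:
    """Build properties for an Iceberg REST catalog client.

    Assumed conventions (verify against current documentation):
      - the REST endpoint is https://s3tables.<region>.amazonaws.com/iceberg
      - the warehouse is the table bucket ARN
      - requests are SigV4-signed for the "s3tables" service
    """
    return {
        "type": "rest",
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        "warehouse": table_bucket_arn,
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }


if __name__ == "__main__":
    props = s3tables_rest_catalog_properties(
        "us-east-1",
        "arn:aws:s3tables:us-east-1:111122223333:bucket/example-tables",  # hypothetical
    )
    for key, value in props.items():
        print(f"{key} = {value}")
    # With PyIceberg installed and AWS credentials configured, these
    # properties could be passed on roughly as:
    #   from pyiceberg.catalog import load_catalog
    #   catalog = load_catalog("s3tables", **props)
```

Because the catalog speaks the standard REST protocol, the same properties pattern should apply to other engines that implement an Iceberg REST client, which is the cross-engine point of this bullet list.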

Simplicity ↔ velocity call-out

The 2025 post is unusually frank that Tables is an early point on its own curve: "we knew we were making a simplicity versus velocity decision… we have a lot of simplification and improvement left to do." A live example of the concepts/simplicity-vs-velocity tension.

Design thesis Tables validates

Warfield's explicit claim, drawn from shipping Tables:

"Historically, we've always talked about S3 as an object store and then gone on to talk about all of the properties of objects… security, elasticity, availability, durability, performance… I think one thing that we've learned from the work on Tables is that it's these properties of storage that really define S3 much more than the object API itself."

In other words, tables can be first-class in S3 without contradicting what S3 is, because S3 is defined by its storage properties, not by the object verb set. This reframes the whole system; see systems/aws-s3.

Open questions / caveats

  • Iceberg is an open table format. S3 Tables' managed semantics could diverge subtly from customer-managed Iceberg over time (compaction policy, snapshot retention, consistency of metadata commits). The 2025 post doesn't address this.
  • No published numbers on table-count scaling, commit latency, or compaction SLOs.
  • Interaction with S3 object-level features (lifecycle, object lock, cross-region replication) on Tables is not spelled out beyond "existing S3 features don't work exactly as expected" on raw Iceberg.

Place in the multi-primitive lineage (2024-2026)

Tables was the first of three new first-class data primitives added to the S3 platform between re:Invent 2024 and 2026-04 — see patterns/presentation-layer-over-storage and systems/aws-s3 for the overall framing:

Primitive    Launch            Page
Tables       re:Invent 2024    this page
Vectors      re:Invent 2025    systems/s3-vectors
Files        2026-04-07        systems/s3-files

The 2026-04-07 post (see below) reports over 2 million tables stored in S3 Tables approximately 16 months after launch — retroactive validation of the "table as first-class S3 primitive" architectural bet.

Seen in
