SYSTEM Cited by 1 source

S3 Inventory

S3 Inventory is AWS's managed object-listing report for an S3 bucket or a prefix-filtered subset of one, generated on a daily or weekly schedule. Reports are delivered to a destination bucket as CSV, Parquet, or ORC; each row describes one object (key, version, size, storage class, last-modified, …). It is the canonical way to get a point-in-time listing of a bucket without issuing millions of LIST requests, though AWS documents the listing as eventually consistent, so very recent writes or deletes may be missing.
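As a rough illustration of the knobs described above (schedule, output format, optional prefix filter), the sketch below builds the configuration document that boto3's `put_bucket_inventory_configuration` accepts. The helper name `inventory_configuration`, the bucket names, and the chosen optional fields are assumptions for the example, not something taken from the cited source.

```python
def inventory_configuration(config_id, dest_bucket_arn,
                            frequency="Weekly", fmt="Parquet", prefix=None):
    """Build an S3 Inventory configuration dict (hypothetical helper)."""
    if frequency not in ("Daily", "Weekly"):
        raise ValueError("frequency must be 'Daily' or 'Weekly'")
    if fmt not in ("CSV", "ORC", "Parquet"):
        raise ValueError("fmt must be 'CSV', 'ORC', or 'Parquet'")

    config = {
        "Id": config_id,
        "IsEnabled": True,
        "IncludedObjectVersions": "Current",
        "Schedule": {"Frequency": frequency},
        "Destination": {
            "S3BucketDestination": {"Bucket": dest_bucket_arn, "Format": fmt}
        },
        # Extra per-object columns to include in each report row.
        "OptionalFields": ["Size", "LastModifiedDate", "StorageClass"],
    }
    if prefix is not None:
        # Restrict the report to keys under this prefix.
        config["Filter"] = {"Prefix": prefix}
    return config


cfg = inventory_configuration(
    "weekly-logs", "arn:aws:s3:::example-inventory-dest", prefix="logs/"
)

# With AWS credentials available, the dict would be applied roughly like this
# (bucket name is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_inventory_configuration(
#       Bucket="example-source-bucket", Id=cfg["Id"],
#       InventoryConfiguration=cfg)
```

Building the dict separately from the API call keeps the example runnable without credentials and makes the configuration easy to review before it is applied.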

Role for this wiki

S3 Inventory shows up in two shapes in the corpus:

  1. Input to S3 Batch Operations — the native manifest format for batch jobs on every object in a bucket.
  2. Join-side for access-based retention (patterns/s3-access-based-retention): joined against S3 server access logs (SAL) to compute the set of unused prefixes over a rolling window. The inventory enumerates "what exists"; SAL enumerates "what was accessed"; the set difference (inventory minus accessed) is "what's unused and safe to delete."
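The join in item 2 amounts to an anti-join: keep the inventory prefixes that have no matching access-log entry in the window. A minimal in-memory sketch, assuming both sides have already been reduced to sets of prefix strings (the function name and sample data are hypothetical; a real pipeline would run this as a query over the inventory and SAL tables):

```python
def unused_prefixes(inventory_prefixes, accessed_prefixes):
    """Prefixes present in the inventory but absent from the access logs."""
    return set(inventory_prefixes) - set(accessed_prefixes)


# Hypothetical sample data.
inventory = {"logs/2024", "logs/2025", "tmp/scratch"}   # what exists
accessed = {"logs/2025", "deleted/old"}                 # what was accessed

print(sorted(unused_prefixes(inventory, accessed)))
# -> ['logs/2024', 'tmp/scratch']
```

Note that `deleted/old` appears in the access logs but not in the inventory, so it drops out: only objects that still exist can be candidates for deletion.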

Seen in

  • sources/2025-09-26-yelp-s3-server-access-logs-at-scale — Yelp's weekly access-based table joins S3 Inventory with a week of SAL to compute unused prefixes. Prefix extraction is explicit about handling trailing slashes — "removing trailing slash because we wanted to avoid confusion where a prefix '/foo' would determine whether a key '/foo/' was accessed or not." Inventory is also the source that translates accessed-prefixes back to full S3 object names for the Batch Operations manifest.
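The two steps mentioned here (normalizing prefixes without a trailing slash, then expanding prefixes back into object keys for a manifest) could look roughly like the sketch below. It is not Yelp's implementation: `prefix_of` and `manifest_rows` are hypothetical helpers, and treating "everything before the last slash" as the prefix is an assumption. The output rows follow the `bucket,key` CSV manifest layout that S3 Batch Operations accepts, with keys URL-encoded.

```python
from urllib.parse import quote


def prefix_of(key):
    """Everything before the last '/', with no trailing slash (assumed rule)."""
    return key.rpartition("/")[0].rstrip("/")


def manifest_rows(bucket, inventory_keys, target_prefixes):
    """Yield (bucket, url_encoded_key) for inventory keys under a target prefix."""
    for key in inventory_keys:
        if prefix_of(key) in target_prefixes:
            # Batch Operations CSV manifests expect URL-encoded object keys.
            yield (bucket, quote(key, safe="/"))


# Hypothetical inventory keys and a set of prefixes selected upstream.
keys = ["logs/2024/a b.txt", "logs/2025/c.txt", "logs/2024/"]
rows = list(manifest_rows("example-bucket", keys, {"logs/2024"}))
```

Because `prefix_of("logs/2024/")` and `prefix_of("logs/2024/a b.txt")` both normalize to `logs/2024`, a folder-marker key and the objects beneath it are matched by the same prefix, which is the ambiguity the quoted trailing-slash rule is meant to avoid.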