S3 access-based retention¶
Problem¶
A multi-tenant S3 bucket (or a bucket shared by many complex systems) accumulates objects over years. Some are actively read; many are not. You want to delete the unused objects to reduce storage cost, but:
- A blanket TTL-based lifecycle policy is too broad — it deletes both active and inactive data.
- Per-tenant lifecycle policies are impractical at large tenant counts.
- You don't have an application-level last-accessed timestamp (or you'd need to invasively instrument every code path that reads the bucket).
The AWS-native last-accessed signal is S3 Server Access Logs (SAL), which record every object access. SAL delivery, however, is best-effort (concepts/best-effort-log-delivery), so deletion decisions based on SAL must tolerate some missing log lines.
Pattern¶
Join S3 Inventory ("what exists") with a rolling window of SAL ("what was accessed") at prefix granularity, compute the difference (prefixes that exist but show no access), and delete their constituent objects via tag-based lifecycle expiration.
The prefix granularity is the load-bearing choice that makes best-effort SAL tolerable: even if individual log lines are missing for some objects, the prefix as a whole will almost certainly have enough log lines to signal active use.
Canonical shape (Yelp's 2025-09-26 disclosure)¶
1. Extract prefix from each key¶
"The prefix covers only immediate keys under it, segmented by
slash (/), and removing trailing slash because we wanted to
avoid confusion where a prefix /foo would determine whether a
key /foo/ was accessed or not." (Source:
sources/2025-09-26-yelp-s3-server-access-logs-at-scale)
-- Extract prefix from key:
array_join(
slice(
split(rtrim(key, '/'), '/'),
1,
cardinality(split(rtrim(key, '/'), '/')) - 1
),
'/'
) AS prefix
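A hypothetical worked example (not from the source) of the expression above, applied to literal keys: a nested key yields its parent prefix, while a top-level key yields the empty string because the slice is empty.

```sql
SELECT
  key,
  array_join(
    slice(
      split(rtrim(key, '/'), '/'),
      1,
      cardinality(split(rtrim(key, '/'), '/')) - 1
    ),
    '/'
  ) AS prefix
-- 'logs/2025/09/26/part-0.gz' -> prefix 'logs/2025/09/26'
-- 'README'                    -> prefix '' (no parent)
FROM (VALUES ('logs/2025/09/26/part-0.gz'), ('README')) AS t(key);
```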
2. Weekly access-based table build¶
Aggregate SAL over a week; compute per-prefix last-access timestamp; write to an "access-based table."
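A minimal sketch of that build in Athena SQL; the table and column names (`sal_logs`, `access_based_table`, `request_datetime`) are assumptions, not Yelp's actual schema.

```sql
-- Weekly rollup: per (bucket, prefix), the most recent access in the window.
INSERT INTO access_based_table
SELECT
  bucket,
  array_join(
    slice(
      split(rtrim(key, '/'), '/'),
      1,
      cardinality(split(rtrim(key, '/'), '/')) - 1
    ),
    '/'
  ) AS prefix,
  max(request_datetime) AS last_accessed
FROM sal_logs
WHERE request_datetime >= date_add('day', -7, current_date)
GROUP BY 1, 2;
```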
3. Equality join, not LIKE¶
Yelp's performance datapoint is dramatic:
"In a query over ~70,000 rows, simply switching from a LIKE operator to an equality (=) operator reduced execution time from over 5 minutes to just 2 seconds—a dramatic improvement."
Why: an equality predicate lets Athena's distributed planner execute the join as a partitioned hash join, spreading work evenly across nodes. A LIKE predicate cannot be hashed, so the planner falls back to a cross join that broadcasts rows to every node.
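The two join shapes, as a sketch (table and column names are assumptions, not Yelp's): precomputing the prefix on the inventory side turns the prefix-match into an equality join.

```sql
-- Slow: LIKE prevents a hash join, forcing a cross join.
--   ... JOIN ... ON i.key LIKE a.prefix || '/%'
-- Fast: derive the prefix first, then join on equality.
SELECT a.prefix, a.last_accessed
FROM access_based_table a
JOIN (
  SELECT DISTINCT
    array_join(
      slice(
        split(rtrim(key, '/'), '/'),
        1,
        cardinality(split(rtrim(key, '/'), '/')) - 1
      ),
      '/'
    ) AS prefix
  FROM inventory
) i
  ON i.prefix = a.prefix;
```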
4. Rejoin inventory to translate prefix → key list¶
Tag-based deletion needs the full object key, not just the
prefix. Join the access-based table back to S3 Inventory to
expand prefix → list of full keys for the S3 Batch
Operations manifest.
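A sketch of that expansion (names assumed): join the set of unused prefixes back to inventory and emit the bucket/key pairs an S3 Batch Operations CSV manifest expects.

```sql
SELECT i.bucket, i.key
FROM inventory i
JOIN unused_prefixes u
  ON u.prefix = array_join(
       slice(
         split(rtrim(i.key, '/'), '/'),
         1,
         cardinality(split(rtrim(i.key, '/'), '/')) - 1
       ),
       '/'
     );
```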
5. Tag + lifecycle expire¶
Feed the key list into patterns/object-tagging-for-lifecycle-expiration.
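The mechanism lives in that pattern; as a minimal sketch of its shape, an S3 Batch Operations job tags the manifest's objects and a bucket lifecycle rule expires matching objects. The tag key/value and expiry here are illustrative, not Yelp's.

```json
{
  "Rules": [
    {
      "ID": "expire-access-based-unused",
      "Status": "Enabled",
      "Filter": {
        "Tag": { "Key": "retention", "Value": "unused" }
      },
      "Expiration": { "Days": 7 }
    }
  ]
}
```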
Why prefix granularity tolerates SAL's best-effort delivery¶
Yelp's verbatim framing:
"We are comfortable relying on best-effort delivery of S3 server access logs when deleting unused objects, since our retention periods are much longer than the maximum log delay. In addition, deletions are based on prefixes—so missing all logs for a given prefix would only occur for truly inactive data."
Translation:
- SAL can miss individual log lines (at Yelp, fewer than 0.001% of log lines arrive more than 2 days late).
- If a prefix has any activity, the expected number of log lines is >>1 per week.
- P(all log lines for an active prefix are missed) ≈ (per-line miss rate)^(log lines per week), vanishingly small for anything actually in use.
- The only prefixes that end up "invisible" to access-based retention are prefixes with genuinely zero access.
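Reading the < 0.001% figure as an upper bound on the per-line miss rate p is an illustrative assumption (the source quotes it for lines more than two days late, not outright loss), but it shows the shape of the bound:

```latex
% p: per-line SAL miss rate, n: log lines per week for an active prefix
P(\text{active prefix invisible}) \approx p^{\,n} \le (10^{-5})^{n}
% even n = 2 accesses in the window already gives 10^{-10}
```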
Interaction with storage tiers¶
This pattern works per storage class. For Glacier / Deep Archive objects the access event is still recorded (SAL includes restore requests), but access cost and retention semantics differ; an access to a Glacier-tier object should still block deletion of the object.
Known exemptions (from Yelp)¶
- Backup buckets with CDC targets — an object that hasn't been accessed for months may simply mean the source table hasn't changed for months: it looks unused but isn't deletable. Yelp flags this as a future exemption axis.
- Legally-held / compliance-held data — retention floor is regulatory, independent of access.
- Cold-read safeties (yearly / quarterly batch restores of old data) — the lookback window needs to exceed the longest legitimate idle period.
Cost comparison to alternatives¶
| Approach | Delete granularity | Storage cost | Safety |
|---|---|---|---|
| Blanket TTL lifecycle | object | lowest | unsafe: deletes active data |
| Per-application TTL | application-defined | moderate (coverage gaps) | moderate |
| Access-based retention (this pattern) | prefix, driven by real access | low | high — tolerates SAL best-effort via prefix granularity |
| Manual audit + delete | arbitrary | low | high but O(human time) |
Seen in¶
- sources/2025-09-26-yelp-s3-server-access-logs-at-scale — canonical wiki instance. Yelp's weekly access-based table joins S3 Inventory with a week of SAL at prefix granularity; equality-join (not LIKE) is load-bearing for tractable performance; prefix granularity is what makes best-effort SAL delivery acceptable as the access signal. Composes with patterns/object-tagging-for-lifecycle-expiration for the deletion step.
Related¶
- systems/aws-s3 — the storage being retained.
- systems/s3-inventory — the "what exists" side of the join.
- systems/amazon-athena — the join engine.
- systems/yelp-s3-sal-pipeline — the canonical wiki worked example.
- concepts/s3-server-access-logs — the access-signal source.
- concepts/best-effort-log-delivery — why prefix granularity is necessary.
- patterns/object-tagging-for-lifecycle-expiration — the deletion mechanism this retention scheme drives.