Skip to content

PATTERN Cited by 1 source

Projection partitioning over managed partitions

Problem

In Athena / Glue, partitioned tables can track their partitions in two ways:

  • Managed partitions — each partition is registered in the Glue metastore via MSCK REPAIR, ALTER TABLE ADD PARTITION, or a Glue Crawler.
  • Partition projectiondeclare the partition layout in TBLPROPERTIES; Athena enumerates the partitions at query time.

For tables with a known prefix template and a steady stream of new partitions (e.g. one new partition per day per bucket), managed partitions impose two ongoing costs:

  1. Refresh churn — every new partition requires a metastore update (MSCK REPAIR or ALTER TABLE).
  2. Planning latency — Athena's query planner consults the Glue metastore to find the matching partitions; lookup cost grows with partition count.

Yelp (2025-09-26):

"We chose Glue's projection partitioning over managed partitions because it requires refreshing partitions (using commands like MSCK REPAIR or ALTER TABLE), which can become cumbersome as the number of partitions grows. This can lead to increased query planning times due to metastore lookups, and will need to be addressed with partitioning indexing." (Source: sources/2025-09-26-yelp-s3-server-access-logs-at-scale)

Pattern

Default to partition projection when the S3 prefix layout is known and deterministic. Reserve managed partitions for cases where the partition layout is ad-hoc or unpredictable.

When projection wins

  • Known template. Prefixes like s3://.../<account>/<region>/<bucket>/<yyyy>/<MM>/<dd>/ can be enumerated deterministically.
  • Steady addition. New partitions land at a predictable cadence (daily / hourly / per-event). Projection just knows they're there; no registration step.
  • High partition count. Projection's per-query enumeration cost is constant per partition matched by WHERE; metastore lookup cost scales with total partition count.

When managed partitions win

  • Ad-hoc / exotic layouts. Prefixes that don't fit a template.
  • Backfill of historical data where some partitions genuinely don't exist. Projection will enumerate paths that may return empty.
  • You need Glue Crawler's schema-inference on heterogeneous data.

Trade-off table

Axis Partition projection Managed partitions
New-partition visibility immediate, on-prefix-match requires refresh
Query-planning cost template expansion; fast metastore lookup (scales with partition count)
Operational overhead none after initial TBLPROPERTIES MSCK REPAIR / ALTER TABLE / Crawler
Failure mode enum 1M-partition cap if unconstrained; empty-partition enumeration refresh drift; partition-index maintenance
Handles heterogeneous / undeclared prefixes no yes

Composition

  • enum partition-projection type fits low-to-mid cardinality keys where the set is enumerable (bucket names, region list, account IDs). Yelp keeps the list fresh via a Lambda + SQS + EventBridge loop.
  • injected partition-projection type fits impractical-to-enumerate keys; caller must constrain in WHERE. Yelp's access-based tables use this.
  • date partition-projection type fits time-series partitions with a declared range.

See concepts/partition-projection for the full type axis.

Anti-pattern — leaving managed partitions in place and

pairing with partition indexes

Partition indexes help managed partitions' planning-latency problem, but they're a mitigation for a symptom. If the underlying prefix is known and deterministic, partition projection is the root-cause fix — zero refresh cost and constant-per-matched-partition planning cost.

Seen in

  • sources/2025-09-26-yelp-s3-server-access-logs-at-scale — canonical wiki instance. Yelp explicitly names the choice at fleet scale: projection was chosen to sidestep MSCK REPAIR / ALTER TABLE churn and query-planning latency. Composed with enum for bucket_name (kept in sync by a Lambda) and date-granular projection on timestamp.
Last updated · 476 distilled / 1,218 read