SYSTEM Cited by 2 sources
Amazon Athena
Amazon Athena is AWS's serverless SQL query engine over systems/aws-s3: a managed Presto/Trino deployment that reads systems/apache-parquet (and other open formats) directly from S3 with no cluster to provision. It is the canonical example of the concepts/compute-storage-separation pattern at the engine level: the storage is S3, and the compute spins up on demand per query.
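To make the "no cluster to provision" point concrete, here is a minimal sketch of the submit-and-poll flow a caller typically follows. It assumes a boto3 Athena client is passed in. The helper name `run_athena_query`, the database name, and the S3 output location are illustrative assumptions, not taken from the sources.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit `sql` to Athena, wait for a terminal state, return the first page of rows.

    `client` is expected to behave like boto3.client("athena").
    `database` and `output_location` are caller-supplied (hypothetical values here).
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # There is no cluster to manage: the caller only polls the query's status.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        state = status["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        reason = status.get("StateChangeReason", "")
        raise RuntimeError(f"Athena query {query_id} ended in state {state}: {reason}")

    # Only the first page is fetched here; for SELECT queries the first row
    # holds the column headers. Larger results need pagination.
    results = client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in results["ResultSet"]["Rows"]
    ]
```

With real credentials this could be called as `run_athena_query(boto3.client("athena"), "SELECT 1", "my_db", "s3://my-bucket/athena-results/")`, where the database and bucket are placeholders. Passing the client in keeps the function easy to exercise with a stub.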
Role for this wiki
Athena is one of the interchangeable SQL compute engines over the shared S3 data lake, alongside systems/amazon-redshift, AWS Glue, systems/apache-spark on systems/amazon-emr, and Apache Hive. It is commonly used as the ad-hoc engine when a dedicated warehouse is overkill.
Seen in
- sources/2024-07-29-aws-amazons-exabyte-scale-migration-from-apache-spark-to-ray-on-ec2 — one of the compute frameworks available to Amazon BDT table subscribers, and one of the three query engines (Spark, Redshift, Athena) used in BDT's Data Reconciliation Service during the Ray migration to verify that Spark-compacted and Ray-compacted tables produced equivalent results across those engines.
- sources/2026-04-01-aws-automate-safety-monitoring-with-computer-vision-and-generative-ai — canonical wiki reference for patterns/data-driven-annotation-curation at ML-ops scale. Athena queries inference-results + customer-feedback data in S3 to aggregate false-positive rates across camera types + deployment conditions, prioritising retraining on image sources with elevated error rates. Also surfaces below-confidence-threshold inferences for targeted annotation. Replaces untenable blanket per-site daily annotation jobs at hundreds of sites.
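The annotation-curation use in the second entry can be sketched as two Athena SQL statements plus a pure-Python equivalent of the aggregation for local checking. The table name `inference_results`, every column name, the `'false_positive'` label, and the 0.6 threshold are hypothetical. The source does not publish its schema, and it likely combines separate inference and feedback datasets rather than the single table assumed here.

```python
from collections import defaultdict

# Hypothetical schema: one row per inference, with customer feedback already joined in.
FALSE_POSITIVE_RATE_SQL = """
SELECT camera_type,
       COUNT(*) AS inferences,
       AVG(CASE WHEN feedback = 'false_positive' THEN 1.0 ELSE 0.0 END) AS fp_rate
FROM inference_results
GROUP BY camera_type
ORDER BY fp_rate DESC
"""

# Hypothetical threshold of 0.6; surfaces low-confidence inferences for annotation.
LOW_CONFIDENCE_SQL = """
SELECT image_uri, camera_type, confidence
FROM inference_results
WHERE confidence < 0.6
ORDER BY confidence
"""


def false_positive_rates(records):
    """Compute the same per-camera-type rate as FALSE_POSITIVE_RATE_SQL, in memory.

    `records` is an iterable of dicts with "camera_type" and "feedback" keys.
    Returns (camera_type, rate) pairs sorted by rate, highest first.
    """
    totals = defaultdict(int)
    false_positives = defaultdict(int)
    for record in records:
        camera = record["camera_type"]
        totals[camera] += 1
        if record["feedback"] == "false_positive":
            false_positives[camera] += 1
    rates = {camera: false_positives[camera] / totals[camera] for camera in totals}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)
```

Sorting by rate puts the camera types with the most false positives first, which is the ordering a retraining-prioritisation step would likely consume.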
Related
- systems/amazon-redshift — the dedicated-cluster warehouse peer.
- systems/aws-s3 — the shared storage layer.
- systems/apache-iceberg — Athena reads and writes Iceberg tables registered in the AWS Glue Data Catalog.
- concepts/compute-storage-separation — Athena is the canonical serverless form of this pattern on AWS.