Presto
Definition
Presto is the original distributed SQL query engine from Facebook (open-sourced in 2013). After a governance schism in 2019 it split into two forks: PrestoDB (under the Linux Foundation's Presto Foundation) and PrestoSQL, the latter renamed Trino in December 2020.
Role in this wiki
Presto appears primarily as the historical predecessor of Trino. Named explicitly in sources/2026-03-24-expedia-operating-trino-at-scale-with-trino-gateway: "Trino — a fork of PrestoSQL — is a powerful tool in modern data analytics."
Its other relevance here is as the ancestor of the gateway pattern: Lyft built Presto Gateway as a proxy and load balancer for PrestoDB, and that gateway was later forked and integrated into the Trino ecosystem as Trino Gateway. The cluster-segregation + workload-aware-routing architecture Expedia describes behind Trino Gateway therefore predates the Trino rename — it is a Presto-era pattern that the Trino ecosystem inherited.
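
To make the cluster-segregation + workload-aware-routing idea concrete, here is a minimal sketch. It is not the Presto Gateway or Trino Gateway implementation; the `Cluster` type, the routing-group names, and the `SOURCE_TO_GROUP` mapping are all hypothetical, chosen only to show a gateway picking a cluster group by workload and then a backend within that group.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    group: str            # hypothetical routing group, e.g. "etl" or "adhoc"
    healthy: bool
    running_queries: int


# Hypothetical mapping from a client-declared workload source to a routing group.
SOURCE_TO_GROUP = {"airflow": "etl", "dashboard": "adhoc"}
DEFAULT_GROUP = "adhoc"


def route(source: str, clusters: list[Cluster]) -> Cluster:
    """Pick the least-loaded healthy cluster in the group that serves this workload."""
    group = SOURCE_TO_GROUP.get(source, DEFAULT_GROUP)
    candidates = [c for c in clusters if c.group == group and c.healthy]
    if not candidates:
        raise RuntimeError(f"no healthy cluster in routing group {group!r}")
    return min(candidates, key=lambda c: c.running_queries)
```

Segregating clusters by group keeps long-running batch queries from competing with interactive ones, while the least-loaded choice within a group is one simple balancing policy among several a real gateway might use.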
Seen in
- sources/2023-07-16-highscalability-lessons-learned-running-presto-at-meta-scale — Meta-authored operational retrospective. Confirms Presto is still actively operated at Meta-scale in 2023 across "tens of thousands of machines" spread over multiple regions, with "every single Presto query at Meta" routed through Meta's Presto Gateway. The "legacy" framing in this wiki therefore applies to the open-source PrestoDB/PrestoSQL split, not to Meta's internal deployment. Contributes: the dual canary+shadow release pipeline, per-host failure attribution + auto-drain, end-to-end cluster lifecycle automation, and Gateway admission control + Gateway autoscaling, all from running Presto at Meta-scale.
- sources/2026-03-24-expedia-operating-trino-at-scale-with-trino-gateway — named as the original engine Trino forked from, and as the PrestoDB engine Lyft's Presto Gateway was built for. The gateway concept predates the Trino rename.
- sources/2024-08-31-meta-enforces-purpose-limitation-via-privacy-aware-infrastructure — Presto is explicitly named as a Policy Zones integration target on the batch-processing axis (alongside Spark): "Batch-processing systems that process data rows in batch (mainly via SQL). Examples include real-time and data warehouse systems that power Meta's AI and analytics workloads." A Policy Zone is created per SQL job; annotations are evaluated at table / column / row granularity. This is the first wiki disclosure of Meta's privacy-enforcement integration inside the Presto runtime itself (distinct from the earlier 2023-07-16 operational retrospective, which focused on the Gateway + cluster-lifecycle layer).
Known large-scale deployments
- Meta (internal) — "tens of thousands of machines", every query routed through an internal Gateway; canary + shadow cluster deployment pipeline; automated cluster standup/decommission wired into the data-warehouse hardware pipeline. See sources/2023-07-16-highscalability-lessons-learned-running-presto-at-meta-scale.
- Lyft (historical) — built the original open-source Presto Gateway, which was later forked and integrated into the Trino ecosystem as Trino Gateway.
- Pinterest (internal) — named as a query substrate for Piqama's auto-rightsizing service: "a separate auto-rightsizing service to continuously consume historical data from various sources, including Presto, Iceberg, and user-defined data sources." Canonical wiki instance of Presto used as the analytics front-end for platform-telemetry feedback loops (historical-usage auto-rightsizing) over Iceberg on S3. Also the execution engine underneath the Analytics Agent: the agent's four-layer architecture pushes LLM-generated SQL through Presto with EXPLAIN-before-EXECUTE validation + bounded retry + default LIMIT 100.
Seen in (additional)
- sources/2026-02-24-pinterest-piqama-pinterest-quota-management-ecosystem — Pinterest's Piqama quota-management platform uses Presto as one of the query substrates for its auto-rightsizing service, reading pre-aggregated quota-enforcement + usage statistics from Apache Iceberg on S3 to recompute future quota values. Canonical wiki instance of Presto-as-analytics-engine for a quota / governance platform's feedback loop.
- sources/2026-03-06-pinterest-unified-context-intent-embeddings-for-scalable-text-to-sql — Presto as the execution engine for the Pinterest Analytics Agent. The agent's four-layer architecture uses Presto with EXPLAIN-before-EXECUTE validation + bounded retry on failure + default LIMIT 100 on LLM-generated SQL. Canonical wiki instance of Presto as the validated execution surface for an LLM-driven Text-to-SQL agent at production scale (2,500+ analysts).
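
The EXPLAIN-before-EXECUTE validation, bounded retry, and default LIMIT 100 described for the Analytics Agent can be sketched roughly as follows. This is an illustration under stated assumptions, not Pinterest's code: `run_sql` (submits a statement to Presto) and `regenerate` (asks the LLM for a corrected query given the error text) are hypothetical callables supplied by the caller, and the retry bound of 3 is assumed because the source does not state it.

```python
import re
from typing import Any, Callable

DEFAULT_LIMIT = 100
MAX_ATTEMPTS = 3  # assumed bound; the source only says the retry is bounded


def with_default_limit(sql: str, limit: int = DEFAULT_LIMIT) -> str:
    """Append a LIMIT clause unless the statement already ends with one."""
    stripped = sql.strip().rstrip(";").rstrip()
    if re.search(r"\blimit\s+\d+\s*$", stripped, flags=re.IGNORECASE):
        return stripped
    return f"{stripped} LIMIT {limit}"


def validate_then_execute(
    sql: str,
    run_sql: Callable[[str], Any],          # hypothetical: runs one statement on Presto
    regenerate: Callable[[str, str], str],  # hypothetical: LLM repair step (sql, error) -> sql
    max_attempts: int = MAX_ATTEMPTS,
) -> Any:
    candidate = sql
    last_error = None
    for attempt in range(1, max_attempts + 1):
        bounded = with_default_limit(candidate)
        try:
            # EXPLAIN plans the statement without running it, so syntax and
            # name-resolution errors surface before any real execution.
            run_sql(f"EXPLAIN {bounded}")
            return run_sql(bounded)
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                candidate = regenerate(candidate, str(exc))
    raise RuntimeError(f"query failed after {max_attempts} attempts") from last_error
```

Feeding the error message back into the regeneration step is what makes the retry useful rather than a blind repeat, and the attempt cap keeps a persistently wrong query from looping indefinitely.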
Related
- systems/trino — the direct post-rename successor.
- systems/trino-gateway — descended from Lyft's Presto Gateway.
- systems/meta-presto-gateway — Meta's distinct internal Gateway (not derived from the Lyft/Trino gateway codebase) that fronts every Presto query at Meta.
- systems/meta-data-warehouse — the multi-datacenter data lakehouse Meta's Presto fleet serves.
- systems/pinterest-piqama — Pinterest's quota platform using Presto to query Iceberg-stored telemetry for auto-rightsizing.
- systems/apache-iceberg — common telemetry / lakehouse substrate Presto queries in these platform-governance flows.
- companies/meta — the operator of the largest known Presto deployment.
- companies/pinterest