Trino
Definition
Trino is an open-source distributed SQL query engine, described as "a fork of PrestoSQL" (sources/2026-03-24-expedia-operating-trino-at-scale-with-trino-gateway), that executes federated SQL queries over heterogeneous data sources (object stores, relational databases, Kafka, Cassandra, etc.) without first relocating the data. It is the successor to the original Presto engine following the Presto project split, and one of the canonical query engines for lakehouse-style architectures over Apache Iceberg / Hive / Delta Lake tables on object storage.
Role in this wiki
Trino appears as the distributed SQL engine companies put in front of their data lake. Typical deployment shape:
- One or more Trino coordinators plan queries; many workers execute them; a discovery service tracks cluster membership.
- Connectors abstract each data source; a single query can join across Iceberg / Hive / Kafka / Postgres in one plan.
- At organizational scale, a fleet of Trino clusters is operated — typically segregated by workload shape (patterns/workload-segregated-clusters) — rather than one big cluster handling every workload.
Deployment pattern at scale (Expedia)
Expedia runs multiple Trino clusters categorized by workload shape:
- Adhoc clusters — mixed workloads, medium concurrency; for exploratory analysis and development.
- ETL clusters — high-volume, high-complexity queries, low concurrency; heavy data processing.
- BI clusters — low-complexity queries, high concurrency; dashboards behind Tableau / Looker.
Each cluster's configuration is tuned to the shape of its workload. Users do not address clusters directly; they point at a Trino Gateway, which routes each query to the appropriate cluster based on routing rules.
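To make the "clients only know the gateway" point concrete, here is a minimal sketch of a client-side statement submission. It assumes the gateway accepts the same HTTP client protocol as a Trino coordinator (a POST to /v1/statement carrying X-Trino-* headers). The gateway hostname and the helper name build_statement_request are hypothetical, and the request is only built, never sent.

```python
from urllib.request import Request

# Hypothetical gateway address (an assumption for illustration); clients are
# configured with this single endpoint and never with individual cluster hosts.
GATEWAY_URL = "https://trino-gateway.example.internal"


def build_statement_request(sql: str, user: str, source: str) -> Request:
    """Build (but do not send) a statement submission aimed at the gateway."""
    return Request(
        url=f"{GATEWAY_URL}/v1/statement",
        data=sql.encode("utf-8"),  # the request body is the raw SQL text
        headers={
            "X-Trino-User": user,      # who is running the query
            "X-Trino-Source": source,  # which client application sent it
        },
        method="POST",
    )


req = build_statement_request("SELECT 1", user="analyst", source="tableau")
```

Because the request names only the gateway, which cluster ends up running the query is decided entirely on the gateway side.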
Routing / observability context on the gateway side
The gateway can inspect a handful of request and query properties when routing:
- trinoQueryProperties.getTables(): tables referenced in the query (used for table-based routing to "large-table" clusters).
- trinoQueryProperties.getBody(): raw query text (used for metadata-query detection, such as select version()).
- X-Trino-Source HTTP header: identifies the client application (Tableau, Looker, etc.); drives BI-source routing.
(Source: sources/2026-03-24-expedia-operating-trino-at-scale-with-trino-gateway)
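The three routing inputs described in this section (referenced tables, raw query text, and the X-Trino-Source header) can be combined as in the sketch below. It is an illustration in plain Python, not how Trino Gateway rules are actually written. The routing-group labels, the large-table set, the BI source names, and the rule order are all assumptions, not Expedia's configuration.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only.
LARGE_TABLES = {"lake.events.clickstream"}
BI_SOURCES = {"tableau", "looker"}


@dataclass
class QueryContext:
    """The inputs a routing rule sees, mirroring the properties listed above."""
    body: str                       # raw query text, as from getBody()
    tables: frozenset = frozenset() # referenced tables, as from getTables()
    source: str = ""                # value of the X-Trino-Source header


def choose_routing_group(ctx: QueryContext) -> str:
    """Return a routing-group label; the first matching rule wins."""
    # Collapse whitespace and case so "SELECT  version();" still matches.
    normalized = " ".join(ctx.body.lower().split()).rstrip(";")
    if normalized == "select version()":
        return "metadata"      # cheap metadata probe issued by clients
    if ctx.source.lower() in BI_SOURCES:
        return "bi"            # dashboard traffic: low complexity, high concurrency
    if ctx.tables & LARGE_TABLES:
        return "large-table"   # queries touching known-large tables
    return "adhoc"             # assumed default for everything else
```

A first-match-wins order keeps the outcome predictable when a query satisfies several conditions, for example a Tableau query that also touches a large table. Which rule should take precedence in that case is a policy decision the source does not specify.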
Seen in
- sources/2026-03-24-expedia-operating-trino-at-scale-with-trino-gateway — Expedia on running a multi-cluster Trino fleet behind a Trino Gateway with workload-aware routing; Adhoc / ETL / BI cluster segregation; UI contributions for routing-rule management, query history, and cluster health.
Related
- systems/presto — the predecessor engine Trino forked from.
- systems/trino-gateway — the proxy / load balancer in front of a Trino fleet.
- systems/apache-iceberg — a canonical table format Trino queries.
- systems/apache-hive — the legacy catalog protocol (Hive Metastore) Trino commonly federates with.
- systems/amazon-athena — AWS's managed serverless Presto/Trino offering.
- concepts/workload-aware-routing — the architectural pattern for routing SQL queries based on their shape, realised in Trino fleets via Trino Gateway.