
SYSTEM Cited by 1 source

Trino

Definition

Trino is an open-source distributed SQL query engine, "a fork of PrestoSQL" (sources/2026-03-24-expedia-operating-trino-at-scale-with-trino-gateway), that executes federated SQL queries over heterogeneous data sources (object stores, relational databases, Kafka, Cassandra, etc.) without first moving the data into a single store. It continues the PrestoSQL line that split from the original Presto project, and it is one of the canonical query engines for lakehouse-style architectures over Apache Iceberg / Hive / Delta Lake tables on object storage.

Role in this wiki

Trino appears as the distributed SQL engine companies put in front of their data lake. Typical deployment shape:

  • One or more Trino coordinators plan queries; many workers execute them; a discovery service tracks cluster membership.
  • Connectors abstract each data source; a single query can join across Iceberg / Hive / Kafka / Postgres in one plan.
  • At organizational scale, a fleet of Trino clusters is operated — typically segregated by workload shape (patterns/workload-segregated-clusters) — rather than one big cluster handling every workload.
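As a rough illustration of the cross-connector join described above, the sketch below composes one SQL statement that touches three catalogs and wraps it in a request to the coordinator's /v1/statement endpoint. The catalog, schema, table, host, and user names are all hypothetical, and the request is only built, never sent.

```python
from urllib.request import Request

# Hypothetical catalog.schema.table names; the real ones depend on how each
# connector is registered as a catalog on the cluster.
FEDERATED_SQL = """
SELECT o.order_id, c.email, e.event_type
FROM iceberg.sales.orders AS o
JOIN postgres.public.customers AS c ON c.customer_id = o.customer_id
JOIN kafka.default.click_events AS e ON e.order_id = o.order_id
WHERE o.order_date >= DATE '2026-01-01'
"""


def build_statement_request(base_url: str, user: str, sql: str) -> Request:
    """Build (but do not send) a POST carrying the query text as the body."""
    return Request(
        url=f"{base_url.rstrip('/')}/v1/statement",
        data=sql.strip().encode("utf-8"),
        headers={"X-Trino-User": user},
        method="POST",
    )


# Hypothetical coordinator address and user name.
req = build_statement_request(
    "http://trino.example.internal:8080", "analyst", FEDERATED_SQL
)
print(req.get_method(), req.full_url)
```

The point of the example is that the three sources appear as ordinary three-part table names in a single statement; the engine, not the client, decides how to read from each one.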

Deployment pattern at scale (Expedia)

Expedia runs multiple Trino clusters categorized by workload shape:

  • Adhoc clusters — mixed workloads, medium concurrency; for exploratory analysis and development.
  • ETL clusters — high-volume, high-complexity queries, low concurrency; heavy data processing.
  • BI clusters — low-complexity queries, high concurrency; dashboards behind Tableau / Looker.

Each cluster's configuration is tuned to the shape of its workload. Users do not address clusters directly: they point at a Trino Gateway, which routes each query to the appropriate cluster based on routing rules.
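The indirection can be pictured as a lookup from a workload category to a pool of backend clusters. The sketch below is a toy model rather than Trino Gateway's implementation: the backend URLs are hypothetical, and both the round-robin choice within a category and the fallback to the adhoc pool are assumptions made for illustration.

```python
from itertools import cycle

# Hypothetical backend URLs, grouped by the workload categories described above.
ROUTING_GROUPS = {
    "adhoc": ["http://adhoc-1.trino.example:8080", "http://adhoc-2.trino.example:8080"],
    "etl": ["http://etl-1.trino.example:8080"],
    "bi": ["http://bi-1.trino.example:8080", "http://bi-2.trino.example:8080"],
}


class Gateway:
    """Toy gateway: clients name no cluster, the gateway picks one per query."""

    def __init__(self, groups: dict[str, list[str]]) -> None:
        # One endless round-robin iterator per category (an assumed policy).
        self._cursors = {name: cycle(urls) for name, urls in groups.items()}

    def pick_backend(self, group: str) -> str:
        # Unknown categories fall back to the adhoc pool (also an assumption).
        cursor = self._cursors.get(group, self._cursors["adhoc"])
        return next(cursor)


gateway = Gateway(ROUTING_GROUPS)
print(gateway.pick_backend("bi"))
print(gateway.pick_backend("bi"))
```

Because clients only ever see the gateway's address, clusters can be added to or removed from a category without any client-side change.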

Routing / observability context on the gateway side

A handful of per-query properties are available for the gateway to inspect when routing:

  • trinoQueryProperties.getTables() — tables referenced in the query (used for table-based routing to "large-table" clusters).
  • trinoQueryProperties.getBody() — raw query text (used for metadata-query detection like select version()).
  • X-Trino-Source HTTP header — identifies the client application (Tableau, Looker, etc.); drives BI-source routing.
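To make the three signals above concrete, here is a minimal Python stand-in for a routing decision. It does not use the gateway's own rule syntax: QueryInfo, route, the group names, the large-table list, and the exact source strings are hypothetical, and the rule ordering is an assumption.

```python
from dataclasses import dataclass, field

# Hypothetical values; a real deployment would maintain its own lists.
LARGE_TABLES = {"warehouse.bookings.events"}
BI_SOURCES = {"tableau", "looker"}


@dataclass
class QueryInfo:
    """Stand-in for what getTables(), getBody(), and X-Trino-Source provide."""
    tables: set[str] = field(default_factory=set)
    body: str = ""
    source: str = ""


def route(q: QueryInfo) -> str:
    """Return a routing group name; the first matching rule wins."""
    # Metadata probe detected from the raw query text.
    if q.body.strip().rstrip(";").strip().lower() == "select version()":
        return "metadata"
    # BI tools identified by the client-supplied source header.
    if q.source.lower() in BI_SOURCES:
        return "bi"
    # Any reference to a known large table goes to the large-table clusters.
    if q.tables & LARGE_TABLES:
        return "large-table"
    return "adhoc"


print(route(QueryInfo(body="SELECT version();")))
print(route(QueryInfo(source="Tableau", body="SELECT 1")))
print(route(QueryInfo(tables={"warehouse.bookings.events"})))
```

Putting the cheap text check first means trivial metadata probes never depend on table parsing or on the client setting a source header.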

(Source: sources/2026-03-24-expedia-operating-trino-at-scale-with-trino-gateway)

Seen in
