Thanos

Thanos is a CNCF-incubated open-source metrics project that extends systems/prometheus with long-term storage in object storage, global query federation, and horizontal scalability, addressing the limits a single Prometheus server hits at tens of millions of active series.

Thanos is the upstream project that Databricks forks internally as systems/pantheon.

Core shape

  • Receive — ingests remote-write samples from Prometheus servers (or directly from instrumented applications), keeps recent data in memory, and flushes older blocks to disk and eventually to object storage. Deployed as Receive groups (Kubernetes StatefulSets), with hash-ring-based partitioning across group members for load distribution.
  • Querier — federates PromQL queries across Receive nodes, Store gateways, and local Prometheus replicas. Deduplicates overlapping samples from replicated writes.
  • Store — serves historical blocks out of object storage, letting Querier answer queries over data older than local disk retention.
  • Compactor — downsamples historical blocks (to 5m and 1h resolutions) and compacts overlapping blocks in object storage, keeping long-range queries fast.
  • Ruler — evaluates recording / alerting rules against the Thanos data model.
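The hash-ring partitioning in the Receive bullet can be sketched as below. This is a simplified hashmod-style placement, not Thanos' actual hashring implementation (which supports multiple algorithms, including ketama); the endpoint names and replica handling are illustrative.

```python
import hashlib

def pick_receivers(tenant: str, series_labels: dict, endpoints: list, replicas: int = 1) -> list:
    """Map a (tenant, series) pair onto `replicas` members of a Receive group.

    Simplified sketch: hash the tenant plus the sorted label set, then take
    consecutive endpoints starting at hash % N, so replicated writes land on
    distinct group members.
    """
    key = tenant + "".join(f"{k}={v}" for k, v in sorted(series_labels.items()))
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    start = h % len(endpoints)
    return [endpoints[(start + i) % len(endpoints)] for i in range(replicas)]

endpoints = ["receive-0:10901", "receive-1:10901", "receive-2:10901"]  # hypothetical names
owners = pick_receivers("tenant-a", {"__name__": "up", "job": "api"}, endpoints, replicas=2)
```

Because placement depends only on the hash of the series identity, every router fans a given series out to the same group members without coordination.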

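The Querier's deduplication of replicated writes can be illustrated with a naive merge. This is a sketch only: Thanos' real deduplication iterator uses a penalty algorithm that prefers staying on one replica across gaps, rather than a per-timestamp union.

```python
def dedup(replica_streams):
    """Merge samples from replicas of the same series.

    Naive sketch: union all (timestamp, value) pairs across replicas and
    keep the first value seen per timestamp, so gaps in one replica are
    filled by the other.
    """
    merged = {}
    for stream in replica_streams:
        for ts, val in stream:
            merged.setdefault(ts, val)
    return sorted(merged.items())

r0 = [(1000, 1.0), (2000, 1.0), (3000, 1.0)]
r1 = [(1000, 1.0), (2000, 1.0), (4000, 1.0)]  # replica 1 missed t=3000, has t=4000
merged = dedup([r0, r1])  # one gap-free series from two partial replicas
```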
Tiered storage — the foundational scaling primitive

Thanos' key architectural move over a single Prometheus is its three-tier storage:

  • Memory — most recent samples (hours), served at Prometheus-comparable latency.
  • On-disk — last 24h of blocks, on Receive nodes.
  • Object storage — all older data (S3 / GCS / Azure Blob), served via Store. Decouples compute from storage: a cluster can scale compute up without needing to rebalance historical data.
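The tiering above implies a simple routing rule: a query's time range determines which tiers must be consulted. A minimal sketch, with window sizes as illustrative defaults rather than Thanos constants:

```python
import time

HOURS = 3600

def tiers_for_range(start, end, now, memory_window=2 * HOURS, disk_window=24 * HOURS):
    """Return which storage tiers a query over [start, end] must touch.

    Window sizes are hypothetical; real retention is configurable.
    """
    tiers = []
    if end > now - memory_window:
        tiers.append("memory")          # recent head samples on Receive
    if end > now - disk_window and start < now - memory_window:
        tiers.append("disk")            # local TSDB blocks on Receive
    if start < now - disk_window:
        tiers.append("object-storage")  # historical blocks via Store
    return tiers

now = time.time()
recent = tiers_for_range(now - 600, now, now)            # last 10 minutes
longspan = tiers_for_range(now - 72 * HOURS, now, now)   # last 3 days
```

The point of the decoupling is visible here: only the "object-storage" arm grows with history, and it holds no per-node state that would need rebalancing when compute scales.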

See concepts/tiered-storage-hot-warm-cold for the generalised pattern.

Multitenancy

Thanos supports multitenancy via tenant attribution at the router: write requests carry a tenant header, and the router fans out to the matching Receive group. Each tenant's series are logically isolated even when groups are shared at the cluster level.
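The routing step can be sketched as a header lookup followed by a hashring selection. THANOS-TENANT is Thanos' default tenant header name; the tenant-to-hashring mapping and endpoint names here are hypothetical.

```python
def route_write(headers, hashrings, default_tenant="default-tenant"):
    """Pick the Receive group (hashring) for a remote-write request.

    Sketch: read the tenant header (falling back to a default tenant),
    then select an exact-match hashring or the catch-all "*" ring.
    """
    tenant = headers.get("THANOS-TENANT", default_tenant)
    group = hashrings.get(tenant, hashrings["*"])
    return tenant, group

hashrings = {
    "team-infra": ["infra-receive-0:10901", "infra-receive-1:10901"],   # hypothetical
    "*": ["shared-receive-0:10901", "shared-receive-1:10901"],          # catch-all ring
}
tenant, group = route_write({"THANOS-TENANT": "team-infra"}, hashrings)
```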

Edge cases / scaling realities

Thanos scales from small three-node deployments to fleets of hundreds of instances — but at the high end, operators report needing to:

  • Replace the default ("one large hash ring") Receive topology with multiple isolated StatefulSets for operational isolation.
  • Add memory-retention tiering beyond the default single retention window, to keep ephemeral-workload metrics from dominating memory cost.
  • Layer custom control-plane automation on top of vanilla Kubernetes primitives — generic HPA / StatefulSet rolling updates are insufficient for quorum-preserving rollouts. See patterns/purpose-built-control-plane-for-stateful-tsdb.
  • Layer a pre-storage aggregation tier to absorb cardinality growth — see patterns/aggregation-shield-for-tsdb-cardinality.
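The last adaptation, a pre-storage aggregation tier, can be sketched as dropping ephemeral labels and summing before remote write. The label names and aggregation rule are illustrative, not a documented Thanos feature:

```python
from collections import defaultdict

def aggregate(samples, drop_labels=("pod", "instance")):
    """Pre-storage aggregation: collapse ephemeral labels before write.

    `samples` is a list of (labels_dict, value) pairs. Dropping per-pod
    labels and summing turns N per-pod series into one aggregate series,
    absorbing the cardinality that churning workloads would otherwise
    push into the TSDB.
    """
    out = defaultdict(float)
    for labels, value in samples:
        key = tuple(sorted((k, v) for k, v in labels.items() if k not in drop_labels))
        out[key] += value
    return dict(out)

samples = [
    ({"__name__": "http_requests_total", "job": "api", "pod": "api-1"}, 3.0),
    ({"__name__": "http_requests_total", "job": "api", "pod": "api-2"}, 5.0),
]
agg = aggregate(samples)  # two per-pod series collapse into one
```

The trade-off is loss of per-pod drill-down for the aggregated metrics, exchanged for bounded series growth under pod churn.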

Databricks' Pantheon (systems/pantheon) is a canonical documented instance of all four of these adaptations layered on top of an upstream Thanos fork.
