
ClickHouse

ClickHouse is an open-source columnar OLAP database originally from Yandex, widely used for high-volume analytics and telemetry workloads. Clusters are sharded across nodes; cross-shard queries use distributed tables.

Distributed-query model

A ClickHouse cluster of N shards typically exposes two layers:

  • Distributed tables (in a database conventionally named default) — virtual tables backed by the Distributed table engine that fan out queries to the underlying shard tables on each node.
  • Underlying shard tables (in a database conventionally named r0) — where data is physically stored on each shard.

A client queries the distributed table on any node; that node acts as the query initiator, fans out subqueries to the r0.* tables on each shard, and merges the partial results.
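A minimal sketch of the two layers, using the Distributed engine's `Distributed(cluster, database, table[, sharding_key])` signature. The cluster name `my_cluster`, table name `events`, and column schema are illustrative, not taken from the incident:

```sql
-- Physical shard-local table, created on every node.
CREATE TABLE r0.events
(
    ts      DateTime,
    user_id UInt64,
    payload String
)
ENGINE = MergeTree
ORDER BY (ts, user_id);

-- Virtual fan-out table in `default`, backed by the Distributed engine.
CREATE TABLE default.events AS r0.events
ENGINE = Distributed('my_cluster', 'r0', 'events', rand());

-- Clients query the distributed table; ClickHouse rewrites this into
-- subqueries against r0.events on each shard and merges the results.
SELECT count() FROM default.events;
```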

Historically, distributed subqueries ran under a shared system account. A finer-grained model runs them under the initiating user's account, so per-user resource limits and grants apply correctly and one user's runaway subquery can no longer starve others. Enabling this model requires granting users explicit access to the underlying r0 tables (previously they reached that data only implicitly, through the distributed tables).
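The grant change behind this migration can be sketched as follows; the user name is illustrative:

```sql
-- Before: the user only needs access to the distributed layer;
-- subqueries on each shard run under a shared system account.
GRANT SELECT ON default.* TO analytics_user;

-- After moving to per-user distributed subqueries, the same user also
-- needs explicit access to the underlying shard-local tables:
GRANT SELECT ON r0.* TO analytics_user;
```

As the next section shows, this extra grant also widens what the user sees in the system metadata tables.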

Canonical wiki concept: concepts/clickhouse-distributed-query.

System catalog

ClickHouse exposes metadata tables (system.tables, system.columns, etc.) that list tables / columns visible to the querying user. Metadata visibility follows the user's grants:

  • Without explicit r0 grants: system.columns shows only default columns.
  • With explicit r0 grants: system.columns shows both default and r0 columns.

For a given table name (e.g., http_requests_features), system.columns returns one row per column per database namespace visible to the user. A query that filters by table but not by database therefore returns a different row count depending on the grants in effect.
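The effect can be demonstrated directly against system.columns; the queries below are illustrative sketches of the unfiltered and pinned forms:

```sql
-- Filters by table name only: returns one row per column *per visible
-- database*, so the row count changes when new grants (e.g., on r0.*)
-- make another namespace visible.
SELECT database, name, type
FROM system.columns
WHERE table = 'http_requests_features';

-- Pinning the database makes the result independent of extra grants:
SELECT name, type
FROM system.columns
WHERE table = 'http_requests_features'
  AND database = 'default';
```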

Seen in

  • sources/2025-11-18-cloudflare-outage-on-november-18-2025 — canonical wiki instance. Cloudflare was mid-rollout of the per-user-account distributed-query model. The new grants made r0.http_requests_features visible via system.columns to users who previously saw only default.http_requests_features. A downstream consumer (Bot Management's feature-file generator) ran SELECT ... FROM system.columns WHERE table = 'http_requests_features' without filtering by database, so the result-row count roughly doubled on migrated nodes. The doubled-size feature file then broke the preallocated 200-feature cap in the FL2 core proxy — triggering a 3-hour core-traffic outage.