Skip to content

CONCEPT Cited by 1 source

Database permission migration risk

A class of incident where a correct, defensive permission change on a database causes a downstream consumer to break because the consumer had an implicit assumption about what its queries returned under the old permission model.

The failure surfaces at the intersection of two teams' systems — neither owns both sides, so neither can catch it at code-review time.

Mechanism

  • A database migration grants users broader explicit access for least-privilege reasons (per-user resource limits, audit, fine-grained control).
  • Metadata tables (system.columns, information_schema.*, etc.) begin returning more rows for the same query, because visibility tracks grants.
  • A downstream consumer ran a query whose implicit assumption — "this only returns rows from the one database I have access to" — was true pre-migration and false post-migration.
  • Row count changes; downstream logic behaves differently.

Canonical instance

sources/2025-11-18-cloudflare-outage-on-november-18-2025 — Cloudflare's ClickHouse team added explicit r0.* grants for the per-user distributed-query model. The Bot Management team's feature-file generator ran SELECT ... FROM system.columns WHERE table = 'http_requests_features' without a database filter. Post-grant, the query returned rows from both default and r0 — doubling the feature count. The doubled file blew the FL2 proxy's preallocated 200-feature cap. ~3 hours of core-traffic outage.

Why code review doesn't catch it

  • The migration team doesn't own the downstream consumer code.
  • The downstream team didn't write the migration.
  • The query is syntactically valid in both worlds.
  • Pre-migration test data has the old visibility; post-migration test data has the new visibility; tests run against one or the other, not the intersection.
  • The implicit assumption is in the absence of a filter — the easiest thing for a reviewer to miss.

Mitigations

  • Explicit filters in metadata queries — always WHERE database = ? when querying system.columns / information_schema.columns.
  • Ingest validation on the downstream side — treat the query result as untrusted; validate shape / size / cross-field invariants before loading into a load-bearing data structure. See patterns/harden-ingestion-of-internal-config.
  • Shadow runs across the migration boundary — run the downstream consumer against both grant states during rollout; alarm on result-shape drift.

Seen in

Last updated · 200 distilled / 1,178 read