
Query telemetry as deploy-safety signal

Definition

Query telemetry as deploy-safety signal is the architectural pattern where the database platform uses its own runtime workload observations — which tables / columns / indexes have been queried recently, by whom, at what rate — as a guardrail on destructive schema-change deploys. The signal that "this table matters" is not an explicit developer declaration or a separate registry; it is the raw query log itself.

Canonical framing (Sam Lambert, PlanetScale CEO, 2022-08-02):

"To help prevent this type of outage, PlanetScale warns you if the table to be dropped was recently queried. This will help you avoid the mistake of dropping a table that is in use."

(Source: sources/2026-04-21-planetscale-how-planetscale-prevents-mysql-downtime)

What the signal answers

The structural question any destructive DDL raises is "will this break something that is currently alive?" Traditional answers — grep the codebase, ask the team, check for foreign-key references — are partial and lossy. The canonical failure shape is exactly:

  • Developer intends to drop old_deprecated_table.
  • Developer believes no live code references it.
  • Some cron job, dashboard query, or partner-integration code still issues SELECT … FROM old_deprecated_table once a day.
  • DROP TABLE succeeds. One hour later, the query fires, hits a dropped table, and causes the outage.

The platform has direct empirical evidence of whether the table is live: its own query log. Query telemetry answers "is this live?" with a factual yes/no (the table was read at 14:07:22 by analytics_service) rather than a probabilistic inference.
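The liveness question can be sketched as a lookup against the query log. This is a minimal, hypothetical sketch: the in-memory `QUERY_LOG` list, the `last_access` / `is_live` helpers, and the actor names are all illustrative stand-ins for the platform's real telemetry store, not PlanetScale's implementation.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory stand-in for the platform's query log;
# each entry is (timestamp, actor, table) observed at runtime.
QUERY_LOG = [
    (datetime(2024, 5, 1, 14, 7, 22), "analytics_service", "old_deprecated_table"),
    (datetime(2024, 5, 1, 14, 9, 3), "web_app", "users"),
]

def last_access(table, log=QUERY_LOG):
    """(timestamp, actor) of the most recent query against `table`, or None."""
    hits = [(ts, actor) for ts, actor, t in log if t == table]
    return max(hits) if hits else None

def is_live(table, window, now, log=QUERY_LOG):
    """Factual yes/no: was `table` queried within `window` before `now`?"""
    hit = last_access(table, log)
    return hit is not None and now - hit[0] <= window
```

The answer is empirical: a daily cron read keeps the table "live" under a one-day window even though no application code references it, which is exactly the case grep misses.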

Variants of the signal

  • Recently queried — bare existence of any query against the table in the last N minutes/hours/days. The cheapest signal; Lambert's canonical framing.
  • Queried by actors — which principal / service / hostname ran the query. Enables "drop OK — only the dev-box you own touches it" vs "drop blocked — three production services still read this table" distinctions. Requires actor-tagged query capture.
  • Queried rate — queries/second against the table / column. Enables threshold-based severity ranking ("the table gets 50 qps — a drop is very high risk").
  • Query-pattern coverage — is the column touched only by SELECT * or by explicit SELECT col queries? The distinction drives invisible-column-style migration patterns: a column touched only by SELECT * can be hidden safely; one touched by name breaks loudly.
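The first three variants compose into a severity ranking. A hedged sketch, assuming a per-table telemetry rollup and a registry of production principals (the `TableTelemetry` shape, the `PRODUCTION_ACTORS` set, and the 1 qps threshold are all illustrative assumptions, not documented PlanetScale behaviour):

```python
from dataclasses import dataclass

@dataclass
class TableTelemetry:
    queried_recently: bool   # variant 1: any query in the window
    actors: set              # variant 2: principals seen querying
    qps: float               # variant 3: observed query rate

# Assumed registry of production principals; not part of the source.
PRODUCTION_ACTORS = {"web_app", "analytics_service"}

def drop_risk(t):
    """Rank the risk of a DROP by combining the signal variants."""
    if not t.queried_recently:
        return "low"                      # no evidence of life
    if not (t.actors & PRODUCTION_ACTORS):
        return "low"                      # only dev boxes touch it
    return "high" if t.qps >= 1.0 else "medium"
```

This reproduces the distinctions above: "drop OK — only the dev-box you own touches it" ranks low even though the table was recently queried, while a 50 qps production table ranks high.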

Composition with gated deployment

Query telemetry as deploy-safety signal slots into gated-deployment flows as a late-stage advisory check — after schema diff validation, before the irrevocable cutover. PlanetScale implements this in the deploy-request UI ("recently queried" warning surfaced during the deploy request, not at DROP TABLE execution time). The developer sees the warning at review time and can either proceed (explicit acknowledgement) or back out before any destructive operation runs.

The warning is advisory, not blocking in the canonical PlanetScale framing — Lambert's phrasing is "warns you if the table to be dropped was recently queried", explicitly a warning rather than a hard block. The operator keeps the final call, but the platform surfaces the evidence.
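The advisory (not blocking) semantics can be made concrete. In this sketch, `recently_queried` stands in for the synchronous telemetry lookup and `acknowledge` for the operator's review-time decision; both names and the return strings are hypothetical, chosen only to show that the platform surfaces evidence while the operator keeps the final call:

```python
def review_deploy_request(tables_to_drop, recently_queried, acknowledge):
    """Late-stage advisory check at deploy-request review time.

    `recently_queried` is the set of live tables from the telemetry lookup;
    `acknowledge` is the operator's decision callback given the warnings.
    """
    warnings = [t for t in tables_to_drop if t in recently_queried]
    if warnings and not acknowledge(warnings):
        return "backed_out"   # operator declined after seeing the evidence
    return "proceed"          # no warnings, or explicitly acknowledged
```

Note the asymmetry: the warning can never veto the deploy on its own, but it always runs before any destructive operation does.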

Composition with invariant recovery

The deploy-safety signal is the preventive half of a two-part safety story. The recovery half is instant schema revert — pre-staged inverse replication that restores the old schema without data loss inside a 30-minute window after the mistake.

Together:

  1. Query telemetry warns at deploy time — "this table is in use, are you sure?" — catching the obvious mistakes before they ship.
  2. Inverse replication recovers after the mistake ships — "that was wrong, revert now" — catching the non-obvious mistakes that made it past the warning.

Lambert canonicalises both in the same post: "we also let you roll back the schema deployment without any loss of data." The two mechanisms are architecturally orthogonal but operationally paired — warn-before-deploy + revert-after-deploy is the full backstop.
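The two halves can be modelled as a toy state machine. Only the 30-minute revert window comes from the source; the outcome labels and argument names are illustrative:

```python
from datetime import timedelta

REVERT_WINDOW = timedelta(minutes=30)   # window cited in the source post

def deploy_outcome(warned, acknowledged, breaks_something, time_to_notice):
    """Toy model of the paired backstop: warn before, revert after."""
    if warned and not acknowledged:
        return "backed_out_at_review"    # preventive half caught it
    if breaks_something and time_to_notice <= REVERT_WINDOW:
        return "reverted_no_data_loss"   # recovery half caught it
    if breaks_something:
        return "outage"                  # past the window: neither half helped
    return "deployed"
```

The orthogonality shows in the code paths: the warning branch never consults the clock, and the revert branch never consults the telemetry.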

Composition with rename-as-expand-migrate-contract

PlanetScale does not offer direct RENAME COLUMN DDL; renames are implemented as expand-migrate-contract under the hood. The rationale draws on the same principle: "PlanetScale does warn you if a table has been recently queried in a deploy request" (Dicken). A direct rename is functionally a drop plus an add, and the deploy-safety signal applies identically: if the table is live, the rename carries latent-break risk.
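A hypothetical expand-migrate-contract plan makes the point visible: the destructive step is isolated at the end, which is exactly where the deploy-safety signal applies. The DDL strings and the TEXT type are illustrative, not PlanetScale's generated migration:

```python
def rename_column_plan(table, old, new):
    """Sketch of a 'rename' decomposed into expand-migrate-contract.
    Only the final contract step is destructive; that is the step the
    recently-queried warning gates, exactly as for a bare DROP."""
    return [
        f"ALTER TABLE {table} ADD COLUMN {new} TEXT",   # expand
        f"UPDATE {table} SET {new} = {old}",            # migrate (backfill)
        f"ALTER TABLE {table} DROP COLUMN {old}",       # contract (gated)
    ]
```

Decomposing the rename buys time: the expand and migrate steps can ship while old readers keep working, and the contract step waits until telemetry shows the old column has gone quiet.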

Structural moat

Not every database vendor can surface this signal. It requires:

  • A per-query log at the platform layer capturing which tables / columns each statement touches (most vendors have the raw capture via the MySQL general log, Postgres log_statement=all, pg_stat_statements, etc., but mapping that log onto the schema's object graph requires platform infrastructure).
  • A connection from the deploy system to the query log — the deploy-request UI needs to query the workload telemetry synchronously at review time.
  • A schema-diff engine that can identify which tables / columns a deploy will touch so the platform can ask "is any of this live?"
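The three requirements wire together as diff → lookup → warning list. This sketch uses a deliberately crude stand-in for the schema-diff engine (naive string parsing of DROP TABLE statements); a real diff engine would parse the DDL properly and also cover column-level drops:

```python
def deploy_warnings(ddl_statements, recently_queried):
    """Combine the three requirements: extract the tables a deploy will
    touch (crude hypothetical diff), run the synchronous telemetry lookup,
    and return the warning list for the review-time UI."""
    touched = [s.split()[2] for s in ddl_statements
               if s.upper().startswith("DROP TABLE ")]
    return [t for t in touched if t in recently_queried]
```

The structural point is the middle step: without a synchronous bridge from the deploy system to workload telemetry, the same raw query log cannot produce a warning at review time.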

Self-managed MySQL / Postgres has the raw material (the query log) but no productised connection between that log and the DBA's ALTER workflow. The platform build that PlanetScale ships — deploy-request + Insights + schema-diff integrated into a single UI — is the productised form.
