PlanetScale — Enhanced tagging in Postgres Query Insights¶
Summary¶
Rafer Hazen (PlanetScale) announces enhanced query tagging in
PlanetScale Insights for
Postgres, shipped alongside the
Traffic Control launch.
Insights has supported SQLCommenter-style query tags
for some time, but tags were previously only attached to the
individual-query stream (notable queries — any query that
took > 1 second, read > 10k rows, or errored). This release extends
tagging to the aggregate summary stream (per-query-pattern
statistics emitted every 15s), enabling three new surfaces: (1) tag
distributions per query pattern, (2) database-wide statistics
broken down by tag, (3) tag-filtered query-pattern lists via
tag:MY_TAG:MY_VALUE. The architectural core is two cardinality-
reduction strategies that keep the aggregate stream from
exploding when tags are high-cardinality (like request_id): a
per-query-pattern key-cardinality tracker (> 20 distinct values
for a tag key on a pattern ⇒ collapse to * for 1 hour) and a
per-interval tag-combination limit (> 50 unique combinations in
one 15-second interval ⇒ greedily collapse the highest-cardinality
tag until under threshold). Collapsed status is recorded on every
aggregate so the UI can show "the percentage of tag values where
the value is unknown."
Key takeaways¶
- Two query-data streams power Insights — per-query messages for the notable-query tail (> 1s duration OR > 10k rows read OR error) and per-query-pattern aggregate summaries emitted every 15s. Before this release, only the notable-query stream carried tags; aggregates were tag-blind. (Source: sources/2026-04-21-planetscale-enhanced-tagging-in-postgres-query-insights)
- Aggregate-stream tagging unlocks three question classes that notable-query tagging alone couldn't answer: "What queries has this user executed?" (including cheap/fast ones not in the notable tail), "What percentage of my total query run time is coming from this controller?" (requires attributing aggregate time per tag value), "Which background jobs are executing this query?" (mapping query pattern → set of callers).
- Per-unique-tag-combination messaging is the chosen aggregate
shape — emit one aggregate message per distinct combination of
tag key-value pairs observed for a query pattern in the 15s
window. The rejected alternative — single aggregate message with
a
tags: {"controller=users": 1, "controller=sessions": 2}counter map — cannot attribute aggregate statistics to any individual tag value, "the summary data for one tag is permanently combined with the data from all tags." - Per-unique-combination messaging doesn't scale without
cardinality control. Worst case: customer tags every query
with a unique
request_id→ every execution becomes its own aggregate message, destroying the aggregation benefit. "This would be prohibitively expensive to process and store, and would consume considerable resources on the database host where telemetry data is emitted." - Per-query-pattern key-cardinality tracker (first collapse
mechanism): count distinct values per tag key per query pattern;
if count exceeds 20, mark that key as collapsed for that
pattern for 1 hour; all queries for that pattern during the
window carry
tag_key=*instead of the specific value. Catches inherently high-cardinality keys likerequest_idoruser_id. - Per-pattern scoping is load-bearing: tracking cardinality
globally (across all query patterns) would collapse
high-cardinality-but-highly-correlated-with-pattern tags like
source_location(file + line number of the query call site) unnecessarily —source_locationis high-cardinality globally but typically has 1-2 values per query pattern and doesn't inflate the aggregate message count. Monitoring per-pattern lets it pass through uncollapsed. - Per-interval tag-combination limit (second collapse
mechanism): even individually low-cardinality tags can combine
explosively — 6 tag keys × 10 values each = 10⁶ possible
combinations per query pattern. Within each 15-second interval,
track aggregates keyed by the full tag-combination set; if any
pattern exceeds 50 combinations, iteratively find the
highest-cardinality tag among those aggregates and collapse it
(replace all values with
*), merging the now-identical aggregates until under the threshold. - Collapse is recorded on the wire: emitted aggregate messages carry a flag indicating that a specific key was collapsed; the UI surfaces "the percentage of tag values where the value is unknown" so users know when tag-based breakdowns are incomplete. Collapse is lossy-by-design but observable.
- Three user-facing surfaces (all enabled by aggregate-stream
tagging): (a) Query-pattern details page shows tag
distribution for one pattern — list of tag values with
percentage of queries that included each value;
(b) Tags section in Insights sidebar shows database-level
aggregate statistics broken down by tag (or set of tags);
(c) Query-pattern search via
tag:MY_TAG:MY_VALUEreturns query patterns filtered to queries carrying that tag pair. - Automatic tags added by Insights (emitted by the extension
itself, not the application):
application_name(Postgres driver-set),username(Postgres user),remote_address(remote IP). These join application-set tags (controller,action,job,source_location) in the same aggregation pipeline.
Systems / concepts / patterns extracted¶
Systems¶
- systems/planetscale-insights — the Postgres extension, the canonical consumer of these tags; gains aggregate-stream tag surfaces as part of this release.
- systems/planetscale-traffic-control — co-launched feature; uses the same SQLCommenter tag substrate for workload-class classification; aggregate tagging enables per-tag observability of Traffic Control's workload classes.
- systems/postgresql — the host database; Insights extension runs inside the Postgres binary.
- systems/kafka — the two-topic pipeline backbone (individual queries → Kafka topic 1; aggregate summaries → Kafka topic 2).
- systems/clickhouse — the aggregate storage engine Kafka feeds into; powers query tables, anomaly detection, and query-related graphs.
Concepts (new)¶
- concepts/aggregate-tag-attribution — the new capability unlocked: attribute aggregate statistics (total runtime, row counts, execution count) to individual tag values.
- concepts/tag-cardinality-collapse — replacing specific tag
values with
*to bound the number of distinct aggregate messages produced. - concepts/per-pattern-tag-cardinality — scoping
cardinality tracking to a single query pattern so that tags
correlated with query pattern (
source_location) don't get collapsed globally. - concepts/per-interval-tag-combination-limit — capping the number of unique tag combinations per query pattern per 15s window via greedy highest-cardinality collapse.
Concepts (extended)¶
- concepts/query-tag-filter — the same
tag:key:value/tag:keyfilter syntax now operates across aggregate-stream data (previously only per-error and per-notable-query). - concepts/actor-tagged-error — sibling surface using the same tags on error occurrences; pre-existing before this release.
- concepts/observability — general concept; aggregate-stream tagging is a canonical new instance of trading storage for attribution.
Patterns (new)¶
- patterns/dynamic-cardinality-reduction-by-tag-collapse — the two-mechanism collapse strategy (per-pattern key tracker + per- interval combination limit with greedy highest-cardinality collapse).
- patterns/dual-stream-telemetry-pipeline — the architectural shape of Insights' backbone: separate emission paths for the individual-query tail vs aggregate-summary bulk, each tuned independently for volume, fidelity, and retention.
Patterns (extended)¶
- patterns/actor-tagged-query-observability — pre-existing pattern; aggregate-stream tagging extends its reach from the notable-query tail + error stream to all queries in aggregate, substantially widening the attribution surface.
Operational numbers¶
- Notable-query capture threshold (pre-existing, restated): query read > 10,000 rows OR duration > 1 second OR resulted in an error → individual-query message emitted.
- Aggregate-summary cadence: 1 aggregate message per query pattern (pre-this-release) / per unique tag combination per query pattern (post-this-release), every 15 seconds.
- Per-query-pattern key-cardinality threshold: 20 distinct
values per tag key per query pattern. Exceeding this collapses
the key to
*for 1 hour on that pattern. - Per-interval tag-combination threshold: 50 unique combinations per query pattern per 15s interval. Exceeding this triggers greedy collapse of the highest-cardinality tag until under threshold.
- Worked example (from the post): five executions of one
pattern with 2 distinct
controllervalues and 4 distincthostvalues = 5 combinations. Collapsinghost(highest cardinality, 4 values) leaves 2 aggregates (one percontrollervalue).
Architecture¶
Postgres extension (Insights)
├── Individual queries topic (Kafka) [pre-existing: tags carried here]
│ └── queries read >10k rows OR >1s OR errored
│ → Notable queries UI
│ → Error-occurrences UI
└── Aggregate summaries topic (Kafka) [this release: tags now carried here]
└── per-query-pattern, per-unique-tag-combination, every 15s
├── Cardinality-reduction layer
│ ├── Per-pattern key-cardinality tracker (>20 values → collapse key for 1h)
│ └── Per-interval combination limit (>50 combos → greedy highest-card collapse)
└── Collapse status flag on every message
→ ClickHouse
→ Query table + anomalies + graphs
→ Tag distribution on pattern details page
→ Tags section in sidebar (DB-wide stats by tag)
→ Search: tag:MY_TAG:MY_VALUE
Aggregate stream powers "the majority of Insights including the query table, anomalies, and all query-related graphs." Individual stream powers the notable-query tail + error occurrences.
Caveats¶
- No production numbers: no disclosure of what percentage of query patterns hit either cardinality threshold in PlanetScale's fleet, no collapse-rate telemetry, no p99 latency impact of the cardinality-reduction layer on the emission path, no storage savings vs a no-collapse baseline.
- Thresholds stated as chosen values, not derived: 20 / 1h / 50 are stated as "currently" values without derivation. No guidance on tuning for different customer profiles.
- Collapse fairness unaddressed: when the per-interval limit
triggers, which tag value losses are visible to which users?
If one query pattern's
request_idis collapsed while another's isn't, is that a UX inconsistency? - Collapse recovery is coarse: the per-pattern key-cardinality tracker collapses for 1 hour; if the cardinality was transient (e.g. a load test ended), the customer can't see the specific values for an hour even though the underlying cardinality is now low. No disclosed mechanism for fast-recovery or manual reset.
- Collapse surfacing is quantitative-only: "percentage of tag values where the value is unknown" — the UI shows the fraction but not the identity of the collapsed key, which could be inferred from context but isn't made explicit.
- Workers vs DB host emission-cost trade-off implied: "would consume considerable resources on the database host where telemetry data is emitted" — hints that the cardinality- reduction layer runs inside the Postgres extension (in the DB host's process space), which has implications for the extension's memory footprint, but the mechanism isn't detailed.
- Application-emitted tags only partially enumerated: the post
lists
controller,action,job,source_locationillustratively — these are common, not exhaustive. The full set is application-author's choice. - No comparison to alternative observability products: Datadog DBM, pganalyze, Percona PMM, pg_stat_statements — each has its own tagging story. No positioning.
Source¶
- Original: https://planetscale.com/blog/enhanced-tagging-in-postgres-query-insights
- Raw markdown:
raw/planetscale/2026-04-21-enhanced-tagging-in-postgres-query-insights-2429d42a.md
Related¶
- systems/planetscale-insights
- systems/planetscale-traffic-control
- systems/planetscale-for-postgres
- systems/postgresql
- systems/kafka
- systems/clickhouse
- concepts/aggregate-tag-attribution
- concepts/tag-cardinality-collapse
- concepts/per-pattern-tag-cardinality
- concepts/per-interval-tag-combination-limit
- concepts/query-tag-filter
- concepts/actor-tagged-error
- concepts/observability
- patterns/dynamic-cardinality-reduction-by-tag-collapse
- patterns/dual-stream-telemetry-pipeline
- patterns/actor-tagged-query-observability
- companies/planetscale