PLANETSCALE 2025-03-11 Tier 3

PlanetScale — Upgrading Query Insights to Metal

Summary

Rafer Hazen (PlanetScale, 2025-03-11, re-fetched 2026-04-21) publishes the production-migration capstone of the PlanetScale Insights pipeline corpus: the Insights backing database — 8 MySQL/Vitess shards serving "approximately 10k UPDATE/INSERT statements per second" from "32 consumer processes, each with 25 writer threads for a total max concurrency of 800 threads" reading Kafka — was migrated from EBS (with additional provisioned IOPS on the sharded keyspace to keep up with telemetry volume) to PlanetScale Metal. PlanetScale upgraded the busiest of the 8 shards first, watched latency graphs at p50/p90/p95/p99, and found "a substantial decrease in latency across all the measured percentiles" — the previously worst shard became the fastest by a significant margin. After a multi-day soak, the remaining 7 shards were upgraded and saw "nearly identical improvement." The outcome: lower Kafka-consumer backlog and more capacity headroom for future Insights-volume growth — "without making any changes to our application, architecture, or sharding configuration." This is the wiki's first canonical production-migration datum for Metal that is explicitly I/O-latency-driven rather than IOPS-cap-driven, and the first canonical instance of the canary-shard substrate-migration pattern — upgrade the busiest shard first, let it soak, then complete the rollout — now canonicalised as a generic pattern for staged infrastructure cutovers across a horizontally sharded fleet.

Key takeaways

  • Workload shape: write-heavy, 800-thread concurrency, IOPS-sensitive. "As of this writing, we execute approximately 10k UPDATE/INSERT statements per second. These writes come from 32 consumer processes, each with 25 writer threads for a total max concurrency of 800 threads." The pipeline pre-aggregates Kafka messages in memory per batch (coalescing to avoid unnecessary writes — see concepts/in-memory-coalescing-by-kafka-key) and hands writes off to a thread pool in each consumer. The 32 × 25 = 800 writer-thread fan-out is the first canonical wiki disclosure of the Insights consumer-side concurrency model; the 10k QPS is the aggregate load across all 8 shards. (Source: sources/2026-04-21-planetscale-upgrading-query-insights-to-metal.)
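
The consumer-side shape described above can be sketched minimally: coalesce a Kafka batch in memory by key, then hand the coalesced rows to a per-process writer pool. The message fields (`key`, `count`, `total_ms`), the `WRITER_THREADS` constant, and the `write_row` callback are illustrative assumptions, not PlanetScale's actual schema or code.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

WRITER_THREADS = 25  # per consumer process; 32 processes => up to 800 concurrent writers


def coalesce(batch):
    # Pre-aggregate a Kafka batch in memory, keyed by query-pattern key,
    # so each key costs at most one UPDATE/INSERT per flush.
    agg = defaultdict(lambda: {"count": 0, "total_ms": 0.0})
    for msg in batch:
        row = agg[msg["key"]]
        row["count"] += msg["count"]
        row["total_ms"] += msg["total_ms"]
    return dict(agg)


def flush(agg, write_row):
    # Hand the coalesced rows to the writer thread pool; each write_row
    # call stands in for one UPDATE/INSERT against the shard.
    with ThreadPoolExecutor(max_workers=WRITER_THREADS) as pool:
        list(pool.map(lambda item: write_row(item[0], item[1]), agg.items()))
```

The coalescing step is what keeps 10k statements/sec tractable: many raw telemetry messages for the same query pattern collapse into a single upsert per flush.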

  • Pre-Metal posture was provisioned-IOPS on EBS. "The Query Insights PlanetScale database has 8 shards and, prior to our upgrade to Metal, we'd had to provision more IOPS to the EBS volumes backing MySQL in our sharded keyspace to keep up with the telemetry volume. Since this workload had demonstrated a sensitivity to I/O latency, we figured it would be a good candidate for upgrading to Metal." The pre-Metal stack already applied sharding-as-IOPS-scaling (8 shards spread the load) and provisioned-IOPS EBS upgrades on top — the IOPS-cost-cliff fix described in Dicken's 2024-08-19 post was not enough on its own for this workload because the binding constraint was I/O latency (per-write round-trip), not IOPS throughput. Canonicalised as concepts/io-latency-sensitive-workload — a distinct diagnosis axis from IOPS-saturation.

  • Canary shard pattern: upgrade busiest first, soak, then fleet-wide rollout. "To do this, we picked 1 of our 8 MySQL shards, the busiest one, to upgrade first. … After letting the first upgrade soak for a few days, we upgraded the remaining shards and saw nearly identical improvement in performance." Canonicalised as patterns/canary-shard-substrate-migration. Picking the busiest (not the quietest) shard maximises signal-to-noise — the worst-performing shard is the one where improvement is most visible on the per-shard percentile plots ("The purple line corresponds to our busiest shard, which was upgraded to Metal around 19:35"). The pattern inverts the naive cautious-rollout instinct (start with the least-important shard) in favour of maximum-signal-first. It composes with the sharded fleet as the natural substrate: each shard is an independent failure domain already, so a per-shard substrate swap is a self-contained operation.
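
The rollout order above can be sketched as a small driver. The `migrate` and `healthy` callbacks and the `shard_load` map are hypothetical placeholders; the post does not describe the actual cutover mechanics.

```python
import time


def canary_shard_rollout(shard_load, migrate, healthy, soak_seconds):
    # shard_load: shard name -> write load. Busiest shard goes first,
    # because that is where a substrate improvement is most visible.
    ordered = sorted(shard_load, key=shard_load.get, reverse=True)
    canary, rest = ordered[0], ordered[1:]

    migrate(canary)                 # canary: the busiest shard
    time.sleep(soak_seconds)        # the post's "a few days" soak
    if not healthy(canary):
        return [canary]             # abort with only the canary touched

    for shard in rest:              # fleet-wide rollout
        migrate(shard)
    return ordered
```

The safety property the pattern leans on is that each shard is already an independent failure domain, so an aborted rollout leaves at most one shard on the new substrate.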

  • Outcome: latency improved at every percentile, busiest shard became fastest. "The following graphs show the query latency at various percentiles. The lines shows the latency for the 8 primaries of the Insights database. The purple line corresponds to our busiest shard, which was upgraded to Metal around 19:35. … Upgrading a test shard to Metal causes a substantial decrease in latency across all the measured percentiles. After the Metal upgrade, our busiest shard with the highest latencies started executing queries faster than the other shards by a significant margin." Four separate graphs — p50, p90, p95, p99 — all show the purple line dropping below the other seven after 19:35. The p99 improvement is the most load-bearing: tail-latency is where EBS's variance (the 250 μs floor plus occasional multi-ms spikes — see concepts/performance-variance-degradation) hurts a high-concurrency write pipeline most. The post does not publish absolute numbers (no "p99 dropped from X ms to Y ms" datum), only the direction and relative shape.

  • Downstream effect: lower Kafka-consumer backlog + capacity headroom. "This resulted in a lower average backlog in our Kafka consumers, and has given us additional capacity to handle increasing message volume in the future." The architectural chain: reduced per-write I/O latency → each writer thread finishes faster → 800 threads drain the Kafka batch faster → consumer backlog shrinks → spare capacity to absorb future volume growth. This is a concrete instance of back-pressure relief via substrate upgrade — the slow downstream (MySQL write path) was the upstream bottleneck (Kafka consumer lag).
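
The backlog metric in that chain is simple to state: per partition, the log-end (latest producer) offset minus the consumer's committed offset, summed over partitions. A minimal sketch, with offset maps standing in for what a Kafka admin client would report:

```python
def consumer_backlog(log_end_offsets, committed_offsets):
    # Backlog (consumer lag): messages produced but not yet consumed,
    # summed across partitions. Faster downstream writes drain batches
    # sooner, so this number falls as per-write I/O latency drops.
    return sum(
        max(log_end_offsets[p] - committed_offsets.get(p, 0), 0)
        for p in log_end_offsets
    )
```

Watching this number fall after the shard upgrade is the observable form of the back-pressure relief described above.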

  • No application, architecture, or sharding changes. "Without making any changes to our application, architecture, or sharding configuration, we were able to realize substantial performance improvements by upgrading to PlanetScale Metal." This is the load-bearing wiki claim for Metal's substrate-swap positioning: the same Insights code, the same 8-shard keyspace, the same Vitess routing, the same 32-consumer × 25-thread deployment — only the storage layer changed. It is the empirical instance of the substitution frame from the 2025-03-11 launch capstone ("Metal differs from the PlanetScale you already know well in exactly one way: We've substituted Amazon EBS and Google Persistent Disk with the fast, local NVMe drives available from the cloud providers") applied to PlanetScale's own internal production workload.

  • Self-dogfooding datum. The workload migrated here is the Insights pipeline itself — the database that stores the per-query-pattern time-series that all PlanetScale customers see in the Insights UI. PlanetScale runs Insights on PlanetScale, migrated it to Metal, and reports the result. This is the first canonical wiki datum of PlanetScale-on-PlanetScale-on-Metal (the 2023-08-10 sibling post architected the pipeline on PlanetScale-on-EBS; this post closes the substrate-migration loop 19 months later).

Operational numbers

| Metric                                  | Value                                   | Source               |
|-----------------------------------------|-----------------------------------------|----------------------|
| Insights write rate (aggregate)         | ~10,000 UPDATE/INSERT statements/sec    | post body            |
| Consumer processes                      | 32                                      | post body            |
| Writer threads per consumer             | 25                                      | post body            |
| Max concurrent writer threads           | 800                                     | post body (32 × 25)  |
| Insights MySQL shards                   | 8                                       | post body            |
| Shards migrated in first batch          | 1 (busiest)                             | post body            |
| Soak period between batches             | "a few days"                            | post body            |
| Shards migrated in second batch         | 7 (remaining)                           | post body            |
| Latency improvement                     | "substantial" at p50/p90/p95/p99        | post body            |
| Outcome metric 1                        | Lower Kafka-consumer backlog            | post body            |
| Outcome metric 2                        | Additional headroom for volume growth   | post body            |
| App / architecture / sharding changes   | None                                    | post body            |

No absolute before/after latency numbers are published. The percentile graphs are the only visual evidence; they are relative-shape illustrations.

Architectural position

This post sits at the intersection of three wiki pipelines:

  1. Insights production story — the 2023-08-10 Hazen post architected the Insights pipeline on 8 EBS-backed MySQL shards; this 2025-03-11 post migrates that same pipeline to Metal. Capstone of the Insights substrate story.

  2. Metal production datum — most Metal coverage on the wiki is pre-launch framing (2025-03-13 I/O-devices post, 2025-03-18 EBS-failure-rate post, 2025-03-11 launch capstone) or post-launch benchmarks on proxy instance types (2025-10-14 Postgres-17-vs-18 on i7i). This is the first self-reported Metal production migration on a real internal PlanetScale workload with concurrency and workload-shape numbers disclosed.

  3. Canary-shard substrate-migration pattern — the "upgrade the busiest one first, soak, then the rest" playbook is a new canonical rollout pattern on the wiki, distinct from patterns/progressive-delivery-per-database (which progressively rolls to an increasing share of customers) and from patterns/operator-scheduled-cutover (which handles a single-cluster schema cutover).

Contributions to the wiki

  • New concept: concepts/io-latency-sensitive-workload — a workload that is bound by per-I/O round-trip latency rather than aggregate IOPS throughput. Diagnostic signal: provisioning more IOPS on the existing substrate does not fix the problem. Canonical fix: direct-attached NVMe substrate.
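
A worked form of the diagnostic, using Little's law: with N synchronous writer threads, each blocked for one write round-trip at a time, sustained throughput is bounded by N / latency. The numbers below are illustrative, not from the post.

```python
def write_throughput_ceiling(writer_threads, per_write_latency_s):
    # Little's law upper bound: halving per-write I/O latency doubles
    # this ceiling, while adding provisioned IOPS leaves it unchanged —
    # the diagnostic signature of an I/O-latency-sensitive workload.
    return writer_threads / per_write_latency_s
```

For example, 800 writer threads at 2 ms per write cap out at 400k writes/sec; the same fleet at 1 ms caps at 800k. When observed throughput tracks this latency-derived ceiling rather than the volume's IOPS limit, more IOPS will not help but a lower-latency substrate will.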

  • New concept: concepts/kafka-consumer-backlog — the lag between Kafka producer offset and consumer committed offset, used as a back-pressure signal on downstream storage throughput.

  • New pattern: patterns/canary-shard-substrate-migration — pick the busiest (highest-signal) shard from a horizontally sharded fleet, migrate its substrate first, soak, then roll out fleet-wide.

  • Extensions: systems/planetscale-metal (first internal-workload production migration datum), systems/planetscale-insights (substrate-migration capstone of the Insights DB story), systems/mysql, systems/vitess, systems/kafka (Kafka-consumer-backlog as downstream-storage back-pressure signal), systems/aws-ebs, patterns/direct-attached-nvme-with-replication (production-migration datum), patterns/sharding-as-iops-scaling (IOPS sharding + provisioned-IOPS on EBS was still not enough — latency was the binding constraint), companies/planetscale (new Recent-articles entry).

Caveats

  • No absolute latency numbers. The 4 percentile graphs show direction and relative shape but no "p99 X ms → Y ms" datum. Operators sizing their own Metal migration cannot quote specific before/after values.
  • No cost disclosure. Provisioned-IOPS EBS → Metal pricing delta on this workload is not published. The 2024-08-19 Dicken post canonicalised the 11–13× cost advantage of sharded gp3 vs RDS+io1 at a hypothetical level; this post does not quantify the actual delta for the Insights workload.
  • No failure-mode narrative. The rollout is described as strictly linear-success. What happened during the migration window (cutover mechanics, any regression on the upgraded shard, reparent events) is not described.
  • "A few days" soak is unspecified. Readers building a similar playbook don't know if "a few days" means 2, 5, or 10 days and what specific metrics were watched during the soak.
  • Blast radius of a Metal-substrate bug is 1/8 of customer Insights. Not discussed. The canary-shard pattern derives its safety from this property (one shard is already a self-contained failure domain under sharding) but the risk surface during migration is not enumerated.
  • Workload-shape generalisation limits. The 800-thread, 10k-QPS, write-heavy profile is Insights-specific. Read-heavy workloads, small-volume OLTP, cache-hot workloads may not see the same "substantial" improvement — the 2025-03-11 Metal launch capstone lists four workload shapes where Metal pays (random reads over buffer pool, working-set-too-large, high-write-replica-lag, latency-intolerant); this post fits shapes (3) and (4) and does not generalise.
  • Short post: ~500 words of body. Three paragraphs of setup, one of thesis, four percentile-plot captions, two of outcome. The architectural density is concentrated but the essay does not venture into Metal-internals or into the details of the per-shard cutover mechanism.
  • Re-fetch timing. Originally 2025-03-11; re-fetched under a 2026-04-21 crawl. The post is ~13 months old at re-fetch; no retrospective framing (e.g. "since this migration, we've …") has been added by PlanetScale on the live page.
