
PlanetScale — What makes up a PlanetScale Vitess database?

Metadata

Summary

Brian Morrison II's pedagogy-101 tour of the full PlanetScale-Vitess-MySQL stack — a mid-length, zero-diagram walkthrough of every architectural layer PlanetScale stacks on top of a customer database, structured as a linear traversal from substrate (Vitess + Kubernetes) up through edge routing, schema-change tooling, backups, replication topology, observability, and security surface. The post's single load-bearing claim, canonicalised verbatim at the top: "Every database and branch on PlanetScale is an independent cluster." This is the architectural substrate behind PlanetScale's already-canonicalised database branching model — branches are not database views or schemas within a shared cluster but entirely separate Vitess clusters, each with its own VTTablets, VTGates, and VTCtlds, created on demand by signalling PlanetScale's internal Kubernetes control plane. Schema is cloned from the source branch via the two clusters' vtctld instances talking to each other.
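The cluster-per-branch model described above can be illustrated with a small sketch. This is a toy model, not PlanetScale's control plane: the names `VitessCluster` and `create_branch` and every field are hypothetical, and the assumption that a new branch starts with empty tables follows the schema-clone framing in this summary rather than any disclosed implementation.

```python
from dataclasses import dataclass, field


@dataclass
class VitessCluster:
    """Hypothetical model of the cluster backing one database branch."""
    branch: str
    vttablets: int = 1  # at least one; this is where the data lives
    vtgates: int = 1    # one or more query routers
    vtctlds: int = 1    # management interface for the cluster
    schema: dict = field(default_factory=dict)  # table name -> DDL string
    rows: dict = field(default_factory=dict)    # table name -> list of rows


def create_branch(source: VitessCluster, name: str) -> VitessCluster:
    """Provision a separate cluster and copy only the schema from the source."""
    branch = VitessCluster(branch=name)
    # The schema is cloned (in the post, via the two clusters' vtctld instances).
    branch.schema = dict(source.schema)
    # Assumption of this sketch: no rows are copied, so each table starts empty.
    branch.rows = {table: [] for table in branch.schema}
    return branch
```

Calling `create_branch(main, "dev")` returns a second, independent object with its own component counts; mutating the branch's schema leaves the source untouched, which is the point of the independent-cluster claim.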

Key takeaways

  1. Every database and branch on PlanetScale is an independent Vitess cluster — verbatim the post's single load-bearing claim. Branches are not logical views inside one cluster but fully separate clusters, each with its own tablet fleet, provisioned via Kubernetes on demand. Schema clone-from-source happens via vtctld-to-vtctld coordination.
  2. Three Vitess primitives compose the cluster: (1) VTTablet, a mysqld instance with a sidecar vttablet process owning connection pooling + health checks ("Each Vitess cluster has at least one and it's where your data lives"); (2) VTGate, "responsible for accepting queries and routing them to the proper tablet, breaking up the query, and dispatching it to multiple tablets if needed"; (3) VTCtld, "a management interface that our internal systems communicate with to perform administrative operations", which controls the entire cluster. First canonical wiki disclosure of vtctld as a named peer of VTGate and VTTablet.
  3. Kubernetes is the orchestration substrate for PlanetScale's multi-tenant Vitess fleet. Customer signal → internal Kubernetes signal → new Vitess cluster provisioned with at least one vttablet pod, one vtctld pod, one or more vtgate pods + load balancing. Canonical framing of PlanetScale's custom-operator-over-statefulset pattern at pedagogy-101 altitude.
  4. Branching is schema-clone-via-vtctld, not replica-at-a-point-in-time: "when you create a branch, we'll spin up a new Vitess cluster for it and (using the vtctld component of the two clusters) apply the schema of the source database branch with the one you just created!" Load-bearing operational detail for understanding how PlanetScale branch creation differs from Aurora Blue/Green or point-in-time-clone approaches.
  5. The MySQL protocol predates cloud-era assumptions about connection quality. PlanetScale's edge routing infrastructure terminates TLS at the node closest to the client, then proxies over PlanetScale's internal backbone to the database's home region, so that "the quality of the connections to the database are improved, resulting in lower latency and faster data access." Canonical instance of patterns/cdn-like-database-connectivity-layer applied to MySQL. Canonicalised at concepts/mysql-connection-termination-at-edge.
  6. Online schema changes are productised via Vitess + PlanetScale's deploy-request layer, not operator-run gh-ost / pt-online-schema-change workflows: "Instead of having to implement online schema change tools such as gh-ost or pt-online-schema-change yourself, PlanetScale provides this functionality out of the box." Shadow table mechanism canonicalised verbatim: "a hidden table that contains the updated schema of the original table. During the process, data is synchronized between the live and shadow tables. When the deploy request is applied, we flip the status of the two tables, so the shadow table becomes the live table, and vice versa."
  7. Schema reverts work by keeping the inverse-replication alive: "Schema reverts give you a window in which the old live table (now the shadow table) will remain in the system … Since data is still being synchronized between the two tables, any writes that have occurred during the revert window will be retained, but with the old schema." The post frames this from the product-UX side in one paragraph; the full mechanism lives on the 2022-10 schema-reverts internals post and patterns/instant-schema-revert-via-inverse-replication.
  8. Backup validation by restore-and-replay verbatim: "Whenever a backup occurs, the previous backup for that database is restored to a separate MySQL instance in the database architecture, then replicates the changed data into that instance before taking a new backup. This ensures that all backups are validated in PlanetScale as well to prevent data loss due to corrupted backups." Canonical pedagogy-101 framing of concepts/automated-backup-validation + patterns/validated-backup-via-restore-replay.
  9. Every production branch has ≥1 replica across multiple availability zones. Base-plan databases distribute Vitess components across multiple AZs in the selected region ("For example, whenever you create an EC2 instance in AWS in the us-east-1 region, you are prompted to select from 6 different AZs provided none others are enabled" — first canonical wiki disclosure of the 6-AZ us-east-1 number). On primary failure, "the Vitess instance running your database branch will automatically failover to one of the replicas and elect it as the new primary, preventing any downtime that would otherwise occur" — instance of patterns/multi-az-vitess-cluster.
  10. Query Insights is enabled by query-pattern normalisation: "Some databases in PlanetScale receive millions of queries per second. Instead of tracking the statistics on individual queries, we run all queries through a normalization process that allows them to be identified based on their patterns." "Through this normalization process, the query data is anonymized by default so we don't track query parameters or the data itself." Slow-query and errored-query details additionally captured for queries that "take more than 1 second to execute, read more than 10k rows, or result in an error" — pedagogy-101 restatement of the full fingerprint + threshold-based deep-capture architecture canonicalised in the 2023-04-20 Insights post.
  11. Dual monitoring stack: per-database MySQL/Vitess query telemetry (surfaced as Insights) plus a separate custom monitoring system for network-adjacent health: "we also have custom monitoring systems built to ensure that traffic is flowing properly and the database can respond to queries as intended. This helps in avoiding situations where the database itself may be functioning perfectly fine, but upstream networking components or other infrastructure are impacted." Canonical pedagogy-101 statement of why observability beyond query telemetry matters at the control-plane / networking / edge-routing altitude.
  12. Security surface enumerated: standard MySQL connection strings with role tiers (read-only → full schema-changing); org-and-database-level granular user permissions; SSO via existing IdP; service tokens for programmatic API/pscale CLI access with their own permission scopes; GitHub Secret Scanning partner integration auto-revoking leaked service tokens or connection strings. First wiki canonicalisation of PlanetScale-as-GitHub-Secret-Scanning-partner.
  13. MySQL version-upgrade automation via Vitess + Kubernetes: "we're able to keep your database up to date with the latest version of MySQL, removing the difficulty of having to perform minor or major updates when new versions of MySQL are released. We're also able to make sure updates are applied successfully and easily roll back if needed."
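The shadow-table flip and revert window quoted in takeaways 6 and 7 can be pictured with a minimal in-memory sketch. It assumes, for illustration only, that every write is projected onto the columns of both tables; the class `OnlineSchemaChange` and its methods are hypothetical and do not reflect how Vitess or PlanetScale actually synchronise data.

```python
class OnlineSchemaChange:
    """Toy model of a live table and its shadow table (illustrative only)."""

    def __init__(self, live_columns, shadow_columns):
        self.tables = {
            "live": {"columns": list(live_columns), "rows": []},
            "shadow": {"columns": list(shadow_columns), "rows": []},
        }

    def write(self, row):
        # Keep both tables in sync: each stores the row projected onto
        # its own column set (columns missing from the row become None).
        for table in self.tables.values():
            table["rows"].append({c: row.get(c) for c in table["columns"]})

    def flip(self):
        # Swap roles: the shadow table becomes live and vice versa.
        # Applying a deploy request is one flip; a revert is a second flip.
        t = self.tables
        t["live"], t["shadow"] = t["shadow"], t["live"]
```

Because `write` keeps feeding both tables after the first `flip`, a second `flip` restores the old schema with the rows written during the revert window still present, which matches the behaviour the post describes.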

Systems, concepts, and patterns extracted

Systems:

  • PlanetScale — the managed database product (frontmatter extended)
  • Vitess — the horizontal-scaling MySQL substrate (frontmatter extended)
  • VTTablet — the per-MySQL sidecar managing connection pool + health (already canonicalised; frontmatter extended)
  • VTGate — the query-routing proxy (already canonicalised; frontmatter extended)
  • VTCtld (NEW SYSTEM PAGE): the Vitess cluster-management interface; "The entire cluster is controlled by a vtctld instance, a management interface that our internal systems communicate with to perform administrative operations."
  • Kubernetes — the orchestration substrate (already canonicalised; frontmatter extended)
  • MySQL — the bottom of the stack (frontmatter extended)
  • PlanetScale Insights — query observability (already canonicalised; frontmatter extended)
  • PlanetScale Global Network — edge-routing infrastructure (already canonicalised; frontmatter extended)
  • gh-ost + pt-osc — named as alternatives PlanetScale abstracts away (frontmatter extended)

Concepts (all already canonicalised; this post extends their Seen-in with pedagogy-101 altitude):

Patterns (all already canonicalised; this post extends their Seen-in with pedagogy-101 altitude):

Operational numbers

  • At least one VTTablet, one VTCtld, one-or-more VTGates per Vitess cluster (minimum topology)
  • At least one additional replica per production branch (default)
  • 6 AZs in AWS us-east-1 (verbatim pedagogy example) — first wiki disclosure of this specific number
  • Millions of queries per second on some PlanetScale databases (upper-end-customer framing)
  • >1 second execution time, >10k rows read, or error — thresholds triggering deep per-query capture (distinct from per-pattern aggregates)
  • Updated MySQL version availability without customer intervention
  • GitHub Secret Scanning partner — auto-revoke on leaked secret detection
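The per-pattern aggregation and the deep-capture thresholds listed above can be sketched as follows. This is a rough approximation under stated assumptions: the regex-based `normalize` function is a stand-in for whatever normalisation Insights really performs (a real implementation would more plausibly work from a SQL parser), and the names `normalize`, `needs_deep_capture`, `SLOW_SECONDS` and `MAX_ROWS_READ` are hypothetical. Only the threshold values (1 second, 10k rows, any error) come from the post.

```python
import re
from collections import Counter

# Assumed literal patterns; a regex pass is a simplification of real SQL parsing.
_STRING = re.compile(r"'(?:[^'\\]|\\.)*'")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")
_SPACE = re.compile(r"\s+")

SLOW_SECONDS = 1.0      # "take more than 1 second to execute"
MAX_ROWS_READ = 10_000  # "read more than 10k rows"


def normalize(sql):
    """Replace literals with '?' so queries group by pattern, not by parameters."""
    sql = _STRING.sub("?", sql)  # strings first, so digits inside them vanish too
    sql = _NUMBER.sub("?", sql)
    return _SPACE.sub(" ", sql).strip()


def needs_deep_capture(duration_s, rows_read, errored):
    """True when a single query crosses any of the thresholds named in the post."""
    return bool(errored or duration_s > SLOW_SECONDS or rows_read > MAX_ROWS_READ)


queries = [
    "SELECT * FROM users WHERE id = 42",
    "SELECT * FROM users WHERE id = 7",
    "SELECT * FROM users  WHERE name = 'Bob 7'",
]
pattern_counts = Counter(normalize(q) for q in queries)
```

Here the two `id` lookups collapse into one pattern with a count of 2, and no parameter value survives in `pattern_counts`, which illustrates why normalised data is anonymised by default.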

Caveats

  • Pedagogy-101 altitude — no architecture diagrams, no production incident retrospective, no latency/throughput benchmarks, no customer case studies. The post is a linear what-is-stacked-on-top-of-what walkthrough, not a deep dive on any one layer.
  • Architecture density ~40% over ~1,800-word body — concentrated in the first half (Vitess + Kubernetes + branching + edge routing); second half (schema changes, backups, replicas, Insights, security) is brief per-layer paragraphs summarising mechanisms canonicalised elsewhere.
  • vtctld is the only new named primitive the post introduces — VTGate + VTTablet already had wiki pages from sibling Gangal / Morrison II / Van Dijk posts; vtctld did not.
  • The "6 AZs in us-east-1" framing is teaching-example, not PlanetScale-specific claim — PlanetScale does not necessarily use all 6 AZs for any given database; the number is illustrative of AWS's AZ granularity.
  • No disclosure of the Kubernetes flavour / operator — the 2023-09-27 Morrison II Scaling hundreds of thousands of database clusters on Kubernetes post canonicalises the full operator + custom-resource story; this 2023-08-23 predecessor names Kubernetes without naming vitess-operator or PlanetScale's internal control-plane forks.
  • Security surface enumeration is broad-brush — per-user permission model not detailed; SSO IdP list not given; service-token rotation policy not named; GitHub Secret Scanning partnership named without the GitHub-side architecture.
  • Online schema change presented at UX altitude — the post says "we support the concept of database branching and deploy requests" as if they are trivially enabled by Vitess, eliding the gh-ost-not-sufficient / six-skill-operational-tax framing canonicalised in Noach 2024-07-23 and the tenet-ten-is-sufficient framing canonicalised in Noach 2022-05-09.
  • Brian Morrison II's fourth piece by wiki-earliest-date on the PlanetScale corpus (after the 2022-10-21 What-is-Vitess post, sources/2026-04-21-planetscale-what-is-vitess-resiliency-scalability-and-performance, and before the 2023-09-27 Kubernetes fleet post, sources/2026-04-21-planetscale-scaling-hundreds-of-thousands-of-database-clusters-on-kubernetes), straddling the what-is-Vitess pedagogy-101 altitude and the Kubernetes-fleet-architecture altitude.

Cross-source continuity

Scope disposition

Tier-3 on-scope at pedagogy-101 altitude. Brian Morrison II pedagogy voice on the PlanetScale Engineering blog; companion piece to his 2022-10-21 What-is-Vitess post and prequel to his 2023-09-27 fleet-architecture post. Architecture density ~40% by word count but the canonical wiki contribution is a named primitive disclosure (vtctld) the wiki lacked, plus a verbatim load-bearing claim ("Every database and branch on PlanetScale is an independent cluster") that the wiki's branching corpus had referenced implicitly without canonicalising. Borderline-case test ("Only skip if architecture content is <20% of the body") passes on substance — one new system page (vtctld) fills a definitional gap that sibling Vitess posts had left open; one verbatim claim canonicalises the substrate beneath the entire concepts/database-branching + systems/planetscale-workflows + systems/planetscale-portals corpus.

Source

Last updated · 470 distilled / 1,213 read