Skip to content

PLANETSCALE 2024-03-19

Read original ↗

PlanetScale — The problem with using a UUID primary key in MySQL

Summary

Brian Morrison II (PlanetScale, originally 2024-03-19, re-fetched 2026-04-21) publishes a pedagogical companion to Ben Dicken's later 2024-09-09 B-trees and database indexes post. Where Dicken canonicalises why random UUIDs punish a B+tree, Morrison canonicalises what the eight official UUID versions actually are, how MySQL stores them, and which workarounds exist — the forward-pointer Dicken defers to. Three classes of contribution: (1) the full UUID v1–v8 taxonomy with byte-layout and intended-use framing — v1 time+node, v2 time+POSIX-UID (DCE Security), v3 namespace+name MD5, v4 random, v5 namespace+name SHA-1, v6 v1-with-reordered-timestamp-for-sortability, v7 Unix- epoch-time-ordered with random node, v8 vendor-specific RFC-compliant escape hatch. (2) MySQL-specific storage + insertion costs: CHAR(36) = 288 bits / 36 bytes, ~9× larger than a 32-bit integer; BINARY(16) = 128 bits / 16 bytes, still 4× a 32-bit INT; InnoDB fills pages to ~94% of page size on sequential-key inserts but as low as ~50% on random-key inserts; this nearly doubles the page count required to store an index of the same size when UUIDs are random. (3) MySQL's UUID_TO_BIN(@uuid, 1) swap-flag: rearranges the UUIDv1 byte order to put the high-order timestamp bits first, producing a COMB-UUID-like sequential-insert shape without application changes. The post also names Snowflake IDs, ULIDs, and NanoIDs as non-UUID alternatives (with specific sample values for each) and discloses that PlanetScale itself uses NanoIDs for its API.

Key takeaways

  1. UUIDs have eight official versions (5 ratified, 3 draft at time of writing). "At the time of this writing, there are five official versions of UUIDs and three proposed versions." Each version layout carries the version number in the first position of the third 4-hex-digit segment; the variant / reserved bits in the fourth segment control bit-packing rules. Source: the post walks each version byte-by-byte in order v1, v2, v3/v5, v4, v6, v7, v8. See concepts/uuid-version-taxonomy.

  2. UUIDv1 is time + node, but backwards for sort order. "A UUID version 1 is known as a time-based UUID." The timestamp is counted in 100-nanosecond increments from Oct 15, 1582 (the date the Gregorian calendar started to be widely used — not Unix epoch); the three segments time_low / time_mid / time_hi store the timestamp; the final segment is the node ID — the MAC address or derived identifier of the system that generated the UUID. Sort-order problem: "UUIDv1, which has the least significant portion of the timestamp first." Byte-wise lexicographic sort does not match temporal order without byte-reordering.

  3. UUIDv2 is effectively deprecated. "Since the low_time segment is where much of the variability of UUIDs reside, replacing this segment increases the chance of collision. As a result, this version of the UUID is rarely used." v2 was DCE Security — it replaced low_time with a POSIX user ID so UUIDs could trace to the user account that generated them. Not recommended.

  4. UUIDv3 + v5 are deterministic (namespace + name). "The goal of these versions is to allow UUIDs to be generated in a deterministic way so that, given the same information, the same UUID can be generated." Input is a namespace UUID + a name string; output is a 128-bit hash. v3 uses MD5; v5 uses SHA-1 — only difference. Not useful as primary keys (no uniqueness guarantee if upstream generates same namespace/name); useful for content addressing.

  5. UUIDv4 is the random one. "Version 4 is known as the random variant because, as the name implies, the value of the UUID is almost entirely random. The exception to this is the first position in the third segment of the UUID, which will always be 4 to signify the version used." This is the version most ORMs and client libraries generate by default, and the version that triggers the full concepts/uuid-primary-key-antipattern.

  6. UUIDv6 = UUIDv1 with timestamp bits flipped for sort order. "Version 6 is nearly identical to Version 1. The only difference is that the bits used to capture the timestamp are flipped, meaning the most significant portions of the timestamp are stored first. … The main reason for this is to create a value that is compatible with Version 1 while allowing these values to be more sortable since the most significant portion of the timestamp is upfront." Backwards-compatible with v1 on decoding; sorts correctly by byte.

  7. UUIDv7 = Unix epoch + random node. "Version 7 is also a time-based UUID variant, but it integrates the more commonly used Unix Epoch timestamp instead of the Gregorian calendar date used by Version 1. The other key difference is that the node (the value based on the system generating the UUID) is replaced with randomness, making these UUIDs less trackable back to their source." First 48 bits are Unix epoch ms; rest is random. This is the recommended UUID version for new systems — sortable by byte, opaque origin, client-side generatable without coordination.

  8. UUIDv8 = vendor-specific RFC-compliant escape hatch. "Version 8 is the latest version that permits vendor-specific implementations while adhering to RFC standards. The only requirement for UUIDv8 is that the version be specified in the first position of the third segment as all other versions." Allows custom layouts for specialised use cases (e.g. embed shard ID + tenant ID + timestamp).

  9. InnoDB B+tree page fills to 94% on sequential keys, ~50% on random keys. "InnoDB assumes that the primary key will increment predictably, either numerically or lexicographically. If true, InnoDB will fill the pages to about 94% of the page size before creating a new page. When the primary key is random, the amount of space utilized from each page can be as low as 50%. Due to this, using UUIDs that incorporate randomness can lead to excessive use of pages to store the index." First wiki disclosure of the specific InnoDB page-fill percentages. See concepts/innodb-page-fill-factor. Near-doubling of total page count translates directly into 2× buffer- pool pressure + 2× storage footprint + 2× I/O to scan the same logical data.

  10. Storage-width ratios: CHAR(36) ≈ 9×, BINARY(16) ≈ 4× a 32-bit integer. "By default, an auto-incrementing integer will consume 32 bits of storage per value. Compare this with UUIDs. If stored in a compact binary form, a single UUID would consume 128 bits on disk. Already, that is 4x the consumption of a 32-bit integer. If instead you choose to use a more human readable string-based representation, each UUID could be stored as a CHAR(36), consuming a whopping 288 bits per UUID. This means that each record would store 9 times more data than the 32-bit integer." Canonical byte-wise storage argument for BINARY(16) over CHAR(36) when you must use UUIDs. See concepts/uuid-string-representation-mysql.

  11. Secondary indexes amplify the PK storage penalty. "Secondary indexes will also consume more space. This is because secondary indexes use the primary key as a pointer to the actual row, meaning they need to be stored with the index. This can lead to a significant increase in storage requirements for your database depending on how many indexes are created on tables using UUIDs as the primary key." Every secondary index leaf carries the full PK. Five secondary indexes × 20 extra bytes (UUID vs BIGINT) × N rows ≫ main-table penalty.

  12. MySQL's UUID_TO_BIN(@uuid, 1) swap-flag produces a COMB-UUID. "There is a helper function in MySQL called uuid_to_bin. Not only does this function convert the string value to binary, but you can use the option 'swap flag', which will reorder the timestamp portion to make the resulting binary more sequential." Worked example from the post — same UUIDv1 input d211ca18-d389-11ee-a506-0242ac120002:

    • Without swap: D211CA18D38911EEA5060242AC120002
    • With swap: 11EED389D211CA18A5060242AC120002

    The time_hi bytes (11EED389) now sit at the front; successive UUIDs generated within the same ~71-year epoch produce near-sequential byte prefixes → same right-edge-of-tree insert locality as a sequential integer. Caveat: MySQL's built-in UUID() function generates UUIDv1 only — so the swap flag is specific to MySQL-generated UUIDv1 values. Client-side-generated UUIDv4 values from application code don't benefit from the swap flag. See concepts/uuid-to-bin-swap-flag.

  13. Snowflake IDs, ULIDs, NanoIDs as non-UUID alternatives. "UUIDs are not the only type of identifier that provides uniqueness within a distributed architecture. Considering they were first created in 1987, there has been plenty of time for other professionals to propose different formats such as Snowflake IDs, ULIDs, or even NanoIDs (which we use at PlanetScale)." Sample values from the post:

    ID type Example Bit-width
    Snowflake ID 7167350074945572864 64 bits (fits in BIGINT)
    ULID 01HQF2QXSW5EFKRC2YYCEXZK0N 128 bits (Crockford base32)
    NanoID kw2c0khavhql 12 chars ≈ 72 bits (URL-safe alphabet)

    Each trades some of UUID's properties for different operational wins — see concepts/snowflake-id, concepts/ulid-identifier, concepts/nanoid-identifier.

  14. PlanetScale itself uses NanoIDs. First-party disclosure: "NanoIDs (which we use at PlanetScale)." Links to a separate post on why they chose NanoIDs for their API. Relevant for the wiki's companies/planetscale page as an operational choice datum.

Operational numbers

  • UUID epoch: Oct 15, 1582 (Gregorian-calendar start, not Unix epoch). UUIDv7 switches to Unix epoch.
  • UUID time resolution: 100 nanoseconds (v1, v6).
  • InnoDB page fill factor, sequential PK: ~94% of page size before split.
  • InnoDB page fill factor, random PK: as low as 50% of page size.
  • Primary-key byte widths:
  • 32-bit INT: 4 bytes (32 bits)
  • 64-bit BIGINT: 8 bytes (64 bits)
  • UUID as BINARY(16): 16 bytes (128 bits) → 4× INT
  • UUID as CHAR(36): 36 bytes (288 bits) → 9× INT
  • Swap-flag example (d211ca18-d389-11ee-a506-0242ac120002):
  • UUID_TO_BIN(@uuidvar): D211CA18D38911EEA5060242AC120002
  • UUID_TO_BIN(@uuidvar, 1): 11EED389D211CA18A5060242AC120002
  • UUID as alternative ID formats (all distributed-safe): Snowflake (64-bit BIGINT), ULID (128-bit base32 string), NanoID (~72-bit URL-safe string, used by PlanetScale).
  • UUID created: 1987 — per the post's own aside, "they were first created in 1987".

Caveats

  • Pedagogical voice, no fleet numbers. Brian Morrison II is PlanetScale's pedagogy-canon byline (prior ingests: Postgres→MySQL migration audit, declarative-schema, replication best practices, isolation levels, branching-vs-Aurora-blue-green); no production incident retrospectives, no customer fleet workload numbers, no page-split-rate measurements.
  • InnoDB 94%/50% page-fill numbers not cited. The post states them as architectural constants without pointing at MySQL source code or InnoDB documentation. They're broadly consistent with MySQL's default innodb_fill_factor behaviour on clustered-index inserts but the 50% floor is a worst-case not a guarantee.
  • UUIDv7 called out but not detailed. The post spends two sentences on v7 (Unix epoch + random node). UUIDv8 gets even less — one sentence. Operators choosing between v6 and v7 for a new system won't find enough here.
  • Swap-flag is MySQL-specific + UUIDv1-only. The UUID_TO_BIN(@uuid, 1) swap flag only rearranges UUIDv1 byte layout (it has knowledge of the time_low/ time_mid/time_hi segment positions). UUIDv4 values have no timestamp to swap. The post mentions "MySQL supports generating UUIDs directly within SQL; however, it only supports UUIDv1 values" — so the swap flag and MySQL's UUID() function together produce a COMB-UUID pipeline, but only for server-side-generated UUIDs.
  • No benchmarks. No INSERT/QPS numbers for UUIDv4 vs UUIDv7 vs BIGINT at any scale. The 94%/50% page-fill framing asserts 2× page count but doesn't measure insert rate, buffer-pool hit rate, or end-to-end query latency.
  • Snowflake/ULID/NanoID mechanics hand-waved. The three alternatives are named with sample values but neither their generation mechanism, collision properties, nor distributed-uniqueness guarantees are walked. "which we use at PlanetScale" is the only operational choice disclosure; the reasoning is deferred to a separate post.
  • Snowflake ID timestamp epoch, machine-ID layout elided. Twitter's original Snowflake layout (41-bit ms-precision timestamp + 10-bit machine ID + 12-bit sequence) is not disclosed. The post shows the output (7167350074945572864) without the bit-field breakdown.
  • CHAR(36) includes hyphens; 32 hex chars alone would be CHAR(32). The 288-bit / 36-byte framing counts the hyphens; operators using CHAR(32) save 4 bytes/row but the post doesn't walk this optimisation.
  • Secondary-index PK-amplification not quantified. Named structurally — every secondary index leaf stores the PK — but without sample numbers. A table with 5 indexes and 100M rows pays 5 × 100M × (20 extra bytes) = 10 GB extra vs BIGINT.
  • Page-fill impact under updates not addressed. Random inserts cause splits; UPDATEs on variable-length columns also split pages. The 94%/50% framing is about insert patterns only.
  • Re-ordering UUIDv1 via swap flag loses client-side-generation property. The swap flag is server-side only — the COMB-UUID effect requires running the server-side SQL function. Client-generated UUIDv4 from app code can't benefit; you'd need to switch to UUIDv6 or UUIDv7 instead (which are natively sortable by byte).
  • Doesn't address InnoDB row-level contention on right- edge insert. Sequential PKs concentrate inserts on the right-most leaf → lock contention on that page under high write concurrency. The post's recommendations for sequential-like UUIDs don't mention this trade-off.

Source

Last updated · 470 distilled / 1,213 read