Skip to content

CONCEPT Cited by 1 source

ULID (Universally Unique Lexicographically Sortable Identifier)

A ULID is a 128-bit time-ordered identifier designed (Alizain Feerasta, 2016) as a UUID alternative with three design goals: (1) same 128-bit width as a UUID so it fits the same storage shapes; (2) lexicographically sortable — string-wise < ordering matches temporal ordering; (3) case-insensitive 26-character representation using Crockford base32. Layout: 48 bits of Unix-epoch millisecond timestamp + 80 bits of randomness. (Source: sources/2026-04-21-planetscale-the-problem-with-using-a-uuid-primary-key-in-mysql.)

Bit layout

Bits Field Meaning
48 timestamp Unix epoch ms
80 random CSPRNG, 1.21e24 combos per ms

Total: 128 bits, same as a UUID.

Canonical representation

ULIDs serialise as 26 characters of Crockford base32:

01HQF2QXSW5EFKRC2YYCEXZK0N
│         ││                │
│         │└── 80 bits random (16 chars)
│         └── '-' separator elided; ULIDs don't use hyphens
└── 48 bits timestamp (10 chars)

Sample from the PlanetScale post: 01HQF2QXSW5EFKRC2YYCEXZK0N

Properties of the Crockford base32 encoding: - No I, L, O, U — avoids visual ambiguity with 1, 0. - Case-insensitive. - URL-safe (no special characters). - 5 bits per character → 26 chars = 130 bits > 128 bits, so the encoding is loss-free.

Why it's useful as a primary key

  1. Lexicographic sort = temporal sort. Store as a CHAR(26) or BINARY(16) and byte-wise comparison places newer values to the right. B+tree inserts land on the right-most path — same locality as BIGINT AUTO_INCREMENT.
  2. 128-bit uniqueness across systems. 80 bits of randomness per ms → collision probability is vanishingly small even at millions of generations per second per generator.
  3. Readable as a string. Humans can eyeball the timestamp prefix (first 10 chars = 48-bit base32 timestamp); debuggable in logs without decoding.
  4. No machine-ID coordination. Unlike Snowflake, ULIDs don't require assigning unique machine IDs at fleet boot.

vs UUIDs

Property UUIDv4 UUIDv7 ULID
Width 128 bits 128 bits 128 bits
Timestamp epoch None Unix Unix
Timestamp resolution N/A ms ms
Time-ordered by byte No Yes Yes
Time-ordered by string No Yes (hex) Yes (base32)
Case-insensitive string Yes (hex) Yes (hex) Yes (Crockford base32)
String length 36 chars (CHAR(36)) 36 chars 26 chars
Hyphenated Yes Yes No
RFC-standardised Yes Yes No (de facto spec only)

ULID wins on string compactness (26 vs 36 chars → saves 10 bytes per stored representation) and string- sortability (UUIDv7's hex is sortable but hyphens break some collations). UUIDv7 wins on standardisation and client-library ubiquity.

vs Snowflake

ULID is 128 bits; Snowflake is 64 bits. ULID wins on collision resistance (no machine- ID registration required). Snowflake wins on storage (fits in BIGINT, half the key width).

Storage in MySQL

Three choices:

  • CHAR(26) — human-readable, 26 bytes. Sortable lexicographically by byte.
  • BINARY(16) — compact, 16 bytes. Sortable bytewise. Requires encode/decode on read/write — no MySQL built-in (unlike UUID_TO_BIN).
  • VARBINARY(16) — same as BINARY(16) except variable-length; no advantage for fixed-width ULIDs.

The 80-bit-random suffix still triggers mild page-fill degradation within a millisecond — but not across milliseconds, so the 94% vs 50% page-fill asymmetry (see concepts/innodb-page-fill-factor) does not apply at the coarse scale.

Monotonic ULIDs

The base ULID spec allows same-millisecond values to be ordered randomly — two ULIDs generated in the same ms have a 50% chance of being out of order when sorted by byte. Monotonic ULIDs (an extension from the same spec) treat the random bits as a monotonically- incrementing counter within a single millisecond: the first ULID of the ms seeds a random 80-bit value; each subsequent ULID in the same ms increments that value by 1. Result: byte-wise sort is strictly temporal even within a millisecond.

Nearly every production ULID library implements monotonic mode by default.

Caveats

  • Not an RFC. ULID is specified on GitHub (ulid/spec) with no IETF standardisation. UUIDv7 is RFC-standardised and should be preferred for new systems where standardisation matters.
  • MySQL has no built-in helpers. Unlike UUID_TO_BIN() for UUIDs, MySQL doesn't ship ULID-specific conversion functions. Encode/decode lives in application code.
  • Case-insensitive but not case-agnostic on disk. MySQL BINARY(16) is case-preserving; ULIDs stored as strings need a case-normalising collation or an application-layer normalisation step to avoid duplicate inserts differing only in case.
  • 80 bits of randomness per ms is overkill for most. If you have at most 1000 ULIDs/ms/generator, the per-ms collision probability is 10^-20. Most of the 80 bits are wasted on collision-resistance most systems don't need.
  • No machine-ID traceability. Unlike Snowflake, you can't tell which machine minted a ULID — the random suffix is random, not structured.
  • 48-bit timestamp ceiling: 10889 AD. Practically no ceiling.
  • Ecosystem fragmentation. Many libraries, each with slight differences on monotonic mode, timestamp resolution, and error handling. Audit the library before committing.

Seen in

Last updated · 470 distilled / 1,213 read