CONCEPT Cited by 1 source
Snowflake ID¶
A Snowflake ID is a 64-bit time-ordered identifier
originally designed by Twitter (2010) for generating unique
post IDs across a sharded fleet without coordination. The
bit layout packs a millisecond timestamp + a machine ID +
a per-millisecond sequence number into a single
BIGINT-compatible value, so inserts into a clustered-
index B+tree retain the sequential-insert locality of an
auto-incrementing integer while also being uniquely
generatable from any machine in the fleet without central
coordination.
(Source:
sources/2026-04-21-planetscale-the-problem-with-using-a-uuid-primary-key-in-mysql.)
Bit layout¶
Twitter's canonical Snowflake layout (64 bits total):
| Bits | Field | Meaning |
|---|---|---|
| 1 | sign | always 0 (keeps value positive for signed BIGINT) |
| 41 | timestamp | ms since a custom epoch (e.g. Twitter's 2010-11-04) |
| 10 | machine ID | 5 bits datacenter + 5 bits worker |
| 12 | sequence | per-ms counter, resets every ms |
Sample from the PlanetScale post: 7167350074945572864.
This is a 64-bit integer — fits in MySQL's BIGINT
(signed) or BIGINT UNSIGNED. Compare to UUIDs: 128 bits
(2× wider), stored as BINARY(16) or CHAR(36).
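The layout in the table can be sketched as a pair of pack/unpack helpers. This is a minimal illustration, not any particular library's API: the names `compose`, `decompose`, and `created_at` are hypothetical, and the epoch is passed in explicitly because each deployment chooses its own.

```python
from datetime import datetime, timedelta, timezone

# Field widths from the canonical layout (the sign bit is simply left as 0).
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MACHINE_SHIFT = SEQUENCE_BITS                   # 12
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS  # 22


def compose(ms_since_epoch: int, machine_id: int, sequence: int) -> int:
    """Pack the three fields into one non-negative 64-bit integer."""
    if not 0 <= ms_since_epoch < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= machine_id < (1 << MACHINE_BITS):
        raise ValueError("machine ID does not fit in 10 bits")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence does not fit in 12 bits")
    return (ms_since_epoch << TIMESTAMP_SHIFT) | (machine_id << MACHINE_SHIFT) | sequence


def decompose(snowflake: int) -> tuple[int, int, int]:
    """Split an ID back into (ms_since_epoch, machine_id, sequence)."""
    ms_since_epoch = snowflake >> TIMESTAMP_SHIFT
    machine_id = (snowflake >> MACHINE_SHIFT) & ((1 << MACHINE_BITS) - 1)
    sequence = snowflake & ((1 << SEQUENCE_BITS) - 1)
    return ms_since_epoch, machine_id, sequence


def created_at(snowflake: int, epoch_ms: int) -> datetime:
    """Recover the wall-clock time, given the generator's custom epoch in Unix ms."""
    ms_since_epoch, _, _ = decompose(snowflake)
    unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return unix_epoch + timedelta(milliseconds=epoch_ms + ms_since_epoch)
```

Because the timestamp occupies the highest bits, comparing two IDs as plain integers compares their timestamps first, then machine ID, then sequence.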
Why it's a good primary key¶
- Time-ordered. Byte-wise sort = temporal sort. Inserts land on the right-most path of a clustered-index B+tree, the same locality property as BIGINT AUTO_INCREMENT.
- Half the size of a UUID. 8 bytes vs 16. Halves the B+tree key width, increases fan-out, halves secondary-index PK-amplification overhead. See concepts/uuid-primary-key-antipattern.
- Distributed-generatable. Machine ID guarantees no collision across the fleet without coordination — as long as machine IDs are unique.
- Fits a machine word. Every comparison is one 64-bit CPU instruction — same as a regular integer.
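The ordering and width claims above can be checked with a small sketch. It assumes the ID is stored big-endian (the helper `to_key_bytes` is a hypothetical name); with that encoding, sorting the raw bytes gives the same order as sorting the integers. The three IDs are made-up values with increasing timestamps.

```python
import struct
import uuid


def to_key_bytes(snowflake: int) -> bytes:
    """Big-endian 8-byte encoding of a non-negative 64-bit ID."""
    return struct.pack(">Q", snowflake)


# Hypothetical IDs: timestamp in the upper bits, machine 3, sequence 0.
ids = [(ms << 22) | (3 << 12) | 0 for ms in (1_000, 1_001, 5_000)]
keys = [to_key_bytes(i) for i in ids]

# Byte-wise order matches generation order for big-endian keys.
bytes_sorted_matches_time = sorted(keys) == keys

snowflake_width = len(keys[0])          # 8 bytes
uuid_width = len(uuid.uuid4().bytes)    # 16 bytes
```

A little-endian encoding would not preserve this property, which is one reason to store the value as a native integer column rather than as opaque bytes.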
Limits¶
- ~70-year timestamp ceiling. 41 bits of ms is 2^41 ms ≈ 70 years from the custom epoch. Most deployments choose an epoch near their launch date to maximise remaining lifetime.
- 1024 unique machine IDs. 10 bits = 1024 workers fleet-wide. Large fleets need wider machine-ID fields or hierarchical allocation.
- 4096 IDs per millisecond per machine. 12 sequence bits per ms. Exceeding this rate requires blocking, bumping the timestamp forward, or falling back.
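The three limits follow directly from the field widths. The arithmetic below is a worked check; the Twitter epoch constant (1288834974657 ms after the Unix epoch, i.e. 2010-11-04 UTC) is the commonly cited value and should be treated as an assumption here.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_BITS, MACHINE_BITS, SEQUENCE_BITS = 41, 10, 12

MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25

lifetime_years = (1 << TIMESTAMP_BITS) / MS_PER_YEAR          # roughly 69.7 years
max_machines = 1 << MACHINE_BITS                               # 1024
ids_per_ms_per_machine = 1 << SEQUENCE_BITS                    # 4096
ids_per_second_per_machine = ids_per_ms_per_machine * 1000     # 4,096,000

# Assumption: commonly cited Twitter epoch, in milliseconds since the Unix epoch.
TWITTER_EPOCH_MS = 1288834974657
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
exhausted_at = unix_epoch + timedelta(milliseconds=TWITTER_EPOCH_MS + (1 << TIMESTAMP_BITS))

print(f"timestamp field lasts about {lifetime_years:.1f} years")
print(f"with the assumed Twitter epoch it runs out in {exhausted_at.year}")
```

Moving the epoch forward by N years moves the exhaustion date forward by the same N years, which is why a launch-date epoch is the usual choice.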
Generation¶
Not standardised as an RFC — multiple implementations with different bit allocations:
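- Twitter Snowflake (2010): the original; open-sourced, but Twitter has stopped maintaining it.
- Discord Snowflake: same 64-bit shape, different epoch (2015-01-01), a 42-bit timestamp with no reserved sign bit, and the 10 machine bits labelled as 5-bit worker ID + 5-bit process ID ahead of the same 12-bit sequence.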
- Instagram shard IDs — similar shape, 13-bit shard ID instead of datacenter+worker.
- Sony's Sonyflake — 39-bit timestamp at 10 ms resolution, 16-bit machine ID, 8-bit sequence — trades ms resolution for larger fleets.
- YouTube video IDs — base64-encoded 64-bit, not strictly time-ordered internally.
- Mastodon uses the same 64-bit ID everywhere for status_id/account_id: same trick.
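Despite the differing bit allocations, these generators tend to share one core loop. The sketch below uses the Twitter-style 41/10/12 split and is not taken from any of the implementations listed; `SnowflakeGenerator` and its injectable `clock` parameter are hypothetical names. It spin-waits when a millisecond's 4096 sequence values are used up, and it simply raises if the clock moves backwards (one possible policy among several).

```python
import threading
import time
from typing import Callable


def _wall_clock_ms() -> int:
    """Current Unix time in milliseconds."""
    return time.time_ns() // 1_000_000


class SnowflakeGenerator:
    def __init__(self, machine_id: int, epoch_ms: int,
                 clock: Callable[[], int] = _wall_clock_ms) -> None:
        if not 0 <= machine_id < 1024:
            raise ValueError("machine ID must fit in 10 bits")
        self._machine_id = machine_id
        self._epoch_ms = epoch_ms
        self._clock = clock
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Policy choice for this sketch: fail loudly rather than risk duplicates.
                raise RuntimeError("clock moved backwards; refusing to generate IDs")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # All 4096 values used this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            elapsed = now - self._epoch_ms
            if not 0 <= elapsed < (1 << 41):
                raise OverflowError("timestamp outside the 41-bit range for this epoch")
            return (elapsed << 22) | (self._machine_id << 12) | self._sequence
```

Injecting the clock keeps the sketch testable: a fake clock can replay a burst within one millisecond or a backwards jump without waiting on real time.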
vs UUIDs¶
| Property | UUIDv4 | UUIDv7 | Snowflake ID |
|---|---|---|---|
| Width | 128 bits | 128 bits | 64 bits |
| Time-ordered | No | Yes | Yes |
| Distributed-generatable | Yes | Yes | Yes (with machine ID) |
| Traceable to generator | No | No | Yes (machine ID) |
| Standardised | RFC 4122 (superseded by RFC 9562) | RFC 9562 (ratified 2024) | No standard |
| B+tree insert locality | Bad | Good | Good |
| Fits in BIGINT | No | No | Yes |
| Client-library support | Every language | Emerging | Per-implementation |
Snowflake wins on width (half the storage) and
already-ubiquitous DB support (every mainstream database
has BIGINT or an equivalent 64-bit integer type). UUIDv7
wins on standardisation, opacity (no machine-ID leak),
and client-library generality.
vs other alternatives¶
- concepts/ulid-identifier: 128-bit, string representation by default. Strictly wider than Snowflake but retains UUID-like opacity.
- concepts/nanoid-identifier — URL-safe random string; no timestamp; PlanetScale's choice for their API.
Caveats¶
- Clock-skew sensitivity. If a machine's wall clock goes backwards (NTP jump, leap second correction), a naive generator produces IDs with timestamps in the past — which can collide with previously-generated IDs or violate monotonicity. Production implementations detect clock-skew and either block until the clock catches up or use a fallback counter.
- Machine-ID allocation is a distributed-systems problem. Getting unique machine IDs to every generator at boot requires a registry (ZooKeeper, etcd, Consul, a managed DB sequence) — coordination overhead that UUIDs avoid.
- Not globally unique across organisations. Snowflake IDs only guarantee uniqueness within a single fleet's machine-ID namespace. UUIDs are unique across independent systems.
- Timestamp epoch is custom. Comparing IDs across two systems requires knowing each system's epoch. This is an operational gotcha during migrations or forensic analysis.
- Not browser-friendly. JavaScript numbers are IEEE-754 doubles and can't represent all 64-bit integers precisely above 2^53. Serialise Snowflake IDs as strings on the wire for web APIs — often with base62 or base64 encoding.
- Bit layout isn't standardised. Different systems use different splits, different epochs, different resolutions — the 64-bit shape is the only thing they agree on.
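The browser caveat can be reproduced without a browser, since Python's `float` is the same IEEE-754 double that a JavaScript Number uses. The sketch below shows the precision loss and two string encodings that avoid it. The base62 alphabet order (digits, then uppercase, then lowercase) is an assumption; there is no single standard ordering, and the helper names are hypothetical.

```python
import json
import string

MAX_SAFE_INTEGER = 2**53 - 1  # largest integer a double-backed Number holds exactly

# Hypothetical ID just above the safe range.
snowflake = 2**53 + 1
as_double = int(float(snowflake))  # what a double-backed JSON parser would end up with
lost_precision = as_double != snowflake

# Option 1: send the ID as a decimal string.
payload = json.dumps({"id": str(snowflake)})
restored = int(json.loads(payload)["id"])

# Option 2: a shorter base62 string (alphabet order is an assumption).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(n: int) -> str:
    if n < 0:
        raise ValueError("only non-negative IDs are supported")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def base62_decode(text: str) -> int:
    n = 0
    for ch in text:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Any 63-bit ID fits in at most 11 base62 characters, compared with up to 19 decimal digits.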
Seen in¶
- sources/2026-04-21-planetscale-the-problem-with-using-a-uuid-primary-key-in-mysql: Brian Morrison II (PlanetScale, 2024-03-19) names Snowflake IDs as the canonical BIGINT-fitting UUID alternative with a sample value (7167350074945572864), alongside ULIDs and NanoIDs.
Related¶
- concepts/uuid-version-taxonomy — UUID alternative family
- concepts/uuid-primary-key-antipattern — what Snowflake IDs avoid
- concepts/ulid-identifier, concepts/nanoid-identifier
- patterns/sequential-primary-key — the locality pattern
- systems/mysql