CONCEPT

Snowflake ID

A Snowflake ID is a 64-bit, time-ordered identifier that Twitter designed in 2010 to generate unique post IDs across a sharded fleet. The bit layout packs a millisecond timestamp, a machine ID, and a per-millisecond sequence number into a single BIGINT-compatible value. Inserts into a clustered-index B+tree therefore keep the sequential-insert locality of an auto-incrementing integer, while any machine in the fleet can still generate unique IDs without central coordination.

Bit layout

Twitter's canonical Snowflake layout (64 bits total):

Bits	Field	Meaning
1	sign	always `0` (keeps value positive for signed `BIGINT`)
41	timestamp	ms since a custom epoch (e.g. Twitter's 2010-11-04)
10	machine ID	5 bits datacenter + 5 bits worker
12	sequence	per-ms counter, resets every ms
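To make the table concrete, here is a minimal sketch of packing the three fields into one integer. The function name `pack_snowflake` is hypothetical, and the epoch constant is assumed to be Twitter's (the 2010-11-04 date from the table, expressed in Unix milliseconds); other deployments would substitute their own.

```python
# Field widths from the table above (Twitter-style layout).
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

# Assumption: Twitter's custom epoch in Unix milliseconds (2010-11-04 UTC).
EPOCH_MS = 1288834974657


def pack_snowflake(unix_ms: int, machine_id: int, sequence: int) -> int:
    """Combine timestamp, machine ID and sequence into one non-negative integer."""
    elapsed = unix_ms - EPOCH_MS
    if not 0 <= elapsed < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp outside the 41-bit range")
    if not 0 <= machine_id < (1 << MACHINE_BITS):
        raise ValueError("machine_id outside the 10-bit range")
    if not 0 <= sequence < (1 << SEQUENCE_BITS):
        raise ValueError("sequence outside the 12-bit range")
    # The timestamp sits in the high bits, so later IDs compare greater.
    return (
        (elapsed << (MACHINE_BITS + SEQUENCE_BITS))
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )
```

Because the three widths add up to 63 bits, the largest possible value is 2^63 - 1 and the sign bit stays `0`.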

Sample from the PlanetScale post:

7167350074945572864

This is a 64-bit integer — fits in MySQL's BIGINT (signed) or BIGINT UNSIGNED. Compare to UUIDs: 128 bits (2× wider), stored as BINARY(16) or CHAR(36).
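As an illustration, the sample value can be split back into its fields with shifts and masks. This sketch assumes the 41/10/12 layout described above; `unpack_snowflake` is a hypothetical helper name. The source does not say which epoch the sample's generator used, so the date conversion below assumes the Unix epoch.

```python
from datetime import datetime, timedelta, timezone

SAMPLE_ID = 7167350074945572864


def unpack_snowflake(snowflake: int) -> tuple[int, int, int]:
    """Split an ID into (elapsed_ms, machine_id, sequence) for a 41/10/12 layout."""
    sequence = snowflake & 0xFFF            # low 12 bits
    machine_id = (snowflake >> 12) & 0x3FF  # next 10 bits
    elapsed_ms = snowflake >> 22            # remaining high bits
    return elapsed_ms, machine_id, sequence


elapsed_ms, machine_id, sequence = unpack_snowflake(SAMPLE_ID)
print(elapsed_ms, machine_id, sequence)  # 1708829420792 1 0
print(SAMPLE_ID.bit_length())            # 63, so it fits a signed BIGINT

# Assumption: the generator counted from the Unix epoch, not Twitter's.
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
print((unix_epoch + timedelta(milliseconds=elapsed_ms)).date())  # 2024-02-25
```

Read against Twitter's 2010 epoch instead, the same timestamp field would land decades in the future, which suggests this sample was not produced with that epoch.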

Why it's a good primary key

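Time-ordered. Numeric sort order follows generation time, because the timestamp occupies the most significant bits (byte-wise order matches too when the value is stored big-endian). Inserts land on the right-most path of a clustered-index B+tree, the same locality property as BIGINT AUTO_INCREMENT.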
Half the size of a UUID. 8 bytes vs 16. Halves the B+tree key width, increases fan-out, halves secondary-index PK-amplification overhead. See concepts/uuid-primary-key-antipattern.
Distributed-generatable. Machine ID guarantees no collision across the fleet without coordination — as long as machine IDs are unique.
Fits a machine word. Every comparison is one 64-bit CPU instruction — same as a regular integer.
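The ordering and size properties above can be checked with a short sketch. The three IDs are made up for illustration (increasing timestamp fields, machine ID 1, sequence 0), and `to_key_bytes` is a hypothetical helper.

```python
import struct
import uuid


def to_key_bytes(snowflake: int) -> bytes:
    """Big-endian 8-byte encoding, so byte order matches numeric order."""
    return struct.pack(">Q", snowflake)


# Hypothetical IDs: timestamp fields 1000, 1001, 2000; machine 1; sequence 0.
ids = [(t << 22) | (1 << 12) for t in (1000, 1001, 2000)]
keys = [to_key_bytes(i) for i in ids]

print(sorted(keys) == keys)     # True: byte-wise order equals generation order
print(len(keys[0]))             # 8 bytes per Snowflake key
print(len(uuid.uuid4().bytes))  # 16 bytes per UUID key
```

The big-endian choice matters here: a little-endian encoding of the same integers would not sort in generation order when compared byte by byte.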

Limits

~70-year timestamp ceiling. 41 bits of ms is 2^41 ms ≈ 70 years from the custom epoch. Most deployments choose an epoch near their launch date to maximise remaining lifetime.
1024 unique machine IDs. 10 bits = 1024 workers fleet-wide. Large fleets need wider machine-ID fields or hierarchical allocation.
4096 IDs per millisecond per machine. 12 sequence bits per ms. Exceeding this rate requires blocking, bumping the timestamp forward, or falling back.
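The three limits follow directly from the field widths. The arithmetic can be sketched as below; the epoch constant is assumed to be Twitter's, and a deployment with a different epoch would get a different exhaustion year.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_BITS, MACHINE_BITS, SEQUENCE_BITS = 41, 10, 12

# Assumption: Twitter's custom epoch in Unix milliseconds (2010-11-04 UTC).
TWITTER_EPOCH_MS = 1288834974657

MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25

lifetime_years = (1 << TIMESTAMP_BITS) / MS_PER_YEAR
max_machines = 1 << MACHINE_BITS
ids_per_ms = 1 << SEQUENCE_BITS
ids_per_second = ids_per_ms * 1000

# Last millisecond the 41-bit timestamp field can represent.
last_ms = TWITTER_EPOCH_MS + (1 << TIMESTAMP_BITS) - 1
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
exhausted = unix_epoch + timedelta(milliseconds=last_ms)

print(f"{lifetime_years:.1f} years")              # 69.7 years
print(max_machines, ids_per_ms, ids_per_second)   # 1024 4096 4096000
print(exhausted.year)                             # 2080
```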

Generation

Not standardised as an RFC — multiple implementations with different bit allocations:

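Twitter Snowflake (2010) — the original; open-sourced, but no longer maintained by Twitter.
Discord Snowflake — same 64-bit shape and the same 5/5/12 widths for the low fields, but a different epoch (2015-01-01), a 42-bit timestamp, and the two 5-bit fields hold a worker ID and a process ID rather than datacenter and worker.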
Instagram shard IDs — similar shape, 13-bit shard ID instead of datacenter+worker.
Sony's Sonyflake — 39-bit timestamp at 10 ms resolution, 16-bit machine ID, 8-bit sequence — trades ms resolution for larger fleets.
YouTube video IDs — base64-encoded 64-bit, not strictly time-ordered internally.
Mastodon uses the same 64-bit ID everywhere for status_id / account_id — same trick.
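For orientation, a generator for the Twitter-style 41/10/12 layout might look like the sketch below. It is illustrative only and not taken from any of the implementations listed. The class name, the `clock` parameter (an injection point for tests), and the choices to wait on sequence exhaustion and to raise on a backwards clock are all assumptions.

```python
import threading
import time


class SnowflakeGenerator:
    """Illustrative generator for a 41/10/12 layout (hypothetical, simplified)."""

    def __init__(self, machine_id: int, epoch_ms: int = 1288834974657, clock=None):
        if not 0 <= machine_id < 1024:
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self.epoch_ms = epoch_ms  # default assumes Twitter's epoch
        # `clock` returns Unix milliseconds; injectable so tests can fake time.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Clock moved backwards: refuse rather than risk a duplicate.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - self.epoch_ms) << 22) | (self.machine_id << 12) | self._sequence
```

A caller would create one generator per process, for example `SnowflakeGenerator(machine_id=7)`, and call `next_id()` for each new row. Each machine ID must be unique across the fleet for the IDs to be collision-free.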

vs UUIDs

Property	UUIDv4	UUIDv7	Snowflake ID
Width	128 bits	128 bits	64 bits
Time-ordered	No	Yes	Yes
Distributed-generatable	Yes	Yes	Yes (with machine ID)
Traceable to generator	No	No	Yes (machine ID)
Standardised	RFC 4122 (superseded by RFC 9562)	RFC 9562 (published 2024)	No standard
B+tree insert locality	Bad	Good	Good
Fits in `BIGINT`	No	No	Yes
Client-library support	Every language	Emerging	Per-implementation

Snowflake wins on width (half the storage) and on ubiquitous support for its underlying type (every mainstream database has a BIGINT-style 64-bit integer column). UUIDv7 wins on standardisation, opacity (no machine-ID leak), and client-library generality.

vs other alternatives

concepts/ulid-identifier — 128-bit, with a string representation by default. Twice as wide as a Snowflake ID, but retains UUID-like opacity.
concepts/nanoid-identifier — URL-safe random string; no timestamp; PlanetScale's choice for their API.

Caveats

Clock-skew sensitivity. If a machine's wall clock goes backwards (NTP jump, leap second correction), a naive generator produces IDs with timestamps in the past — which can collide with previously-generated IDs or violate monotonicity. Production implementations detect clock-skew and either block until the clock catches up or use a fallback counter.
Machine-ID allocation is a distributed-systems problem. Getting unique machine IDs to every generator at boot requires a registry (ZooKeeper, etcd, Consul, a managed DB sequence) — coordination overhead that UUIDs avoid.
Not globally unique across organisations. Snowflake IDs only guarantee uniqueness within a single fleet's machine-ID namespace. UUIDs are unique across independent systems.
Timestamp epoch is custom. Comparing IDs across two systems requires knowing each system's epoch. This is an operational gotcha during migrations or forensic analysis.
Not browser-friendly. JavaScript numbers are IEEE-754 doubles and cannot represent every integer above 2^53 exactly, so a 64-bit ID parsed as a number can silently change value. Serialise Snowflake IDs as strings on the wire for web APIs, either as decimal digits or in a compact encoding such as base62 or base64.
Bit layout isn't standardised. Different systems use different splits, different epochs, different resolutions — the 64-bit shape is the only thing they agree on.
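The precision caveat can be reproduced outside a browser, since Python's `float` is the same IEEE-754 double that JavaScript uses for numbers. The sketch below uses the sample ID plus one as a hypothetical neighbouring ID and shows why a decimal string is the safer wire format.

```python
import json

SAMPLE_ID = 7167350074945572864
neighbour = SAMPLE_ID + 1  # hypothetical next ID, for illustration

# A double carries 53 bits of integer precision, so the +1 is rounded away.
print(int(float(neighbour)) == neighbour)  # False
print(float(2**53 + 1) == float(2**53))    # True: first integer a double cannot hold

# Sending the ID as a decimal string preserves every digit.
payload = json.dumps({"id": str(neighbour)})
print(int(json.loads(payload)["id"]) == neighbour)  # True
```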

Seen in

— Brian Morrison II (PlanetScale, 2024-03-19) names Snowflake IDs as the canonical BIGINT-fitting UUID alternative with a sample value (7167350074945572864), alongside ULIDs and NanoIDs.

concepts/uuid-version-taxonomy — UUID alternative family
concepts/uuid-primary-key-antipattern — what Snowflake IDs avoid
concepts/ulid-identifier, concepts/nanoid-identifier
patterns/sequential-primary-key — the locality pattern
systems/mysql