PATTERN

Sequential primary key¶

Problem¶

A clustered-index database (MySQL InnoDB, InnoDB-alike) stores the whole table in a B+tree keyed on the primary key. The primary-key choice therefore determines the physical on-disk layout of every row. Random primary keys (UUIDv4, v3, v5 — see concepts/uuid-primary-key-antipattern) cause:

Unpredictable insert paths (cache miss per insert).
Write amplification via node splits across the tree.
Range scans fanning out across non-adjacent leaves.

Solution¶

Use a monotonically-increasing primary key. Typical candidates:

BIGINT UNSIGNED AUTO_INCREMENT — MySQL's native ever-increasing integer.
An application-level monotonic ID generator (Snowflake, ULID, Twitter snowflake, Instagram ID, etc.).
UUIDv7 — time-ordered UUID where the first 48 bits are a millisecond timestamp, giving sequential insert locality while retaining UUID-like uniqueness properties.
Timestamp-prefixed composite keys like (created_at_sec, user_id, random_bits).

The key property: successive inserts produce monotonically increasing key values, so every insert lands on the right-most path of the B+tree.

Why it works¶

With a sequential PK, the article states three properties:

We always follow the right-most path when inserting new values. Leaves only get added on the right side of the tree. At the leaf level, data is in sorted order based on when it was inserted.

—

Mechanical consequences:

Hot working set is small and stable. Only the right-most path pages need to be cache-resident — roughly one page per tree level. The InnoDB buffer pool hit rate for inserts approaches 100% regardless of total table size.
Node splits concentrate at the right edge. Splits happen almost exclusively when the right-most leaf fills. Neighbouring internal-node splits are predictable and localised.
Range scans on the PK are sequential I/O. The B+tree leaf linked list traverses neighbouring leaves. A WHERE created > $START AND created < $END query on a sequential PK reads contiguous pages — same shape as a sequential scan.
Time-ordered reads match natural query shape. Most operational workloads (timelines, chat history, event streams, audit logs) are naturally time-sequenced; a sequential PK makes the physical layout match the read pattern by construction.

Worked comparison¶

Same insert workload, two schemas. From the article's bar-chart visualisation of "unique nodes visited for the previous 5 inserts":

Metric	Sequential PK	Random PK (UUIDv4)
Unique nodes visited per 5 inserts	Low, bounded	High, near-tree-depth × insert-count
Buffer-pool hit rate under memory pressure	Near 100%	Collapses as table exceeds RAM
Leaves modified per 5 inserts	~1	Up to 5
Range scan on PK	Sequential	Random fan-out

Caveats¶

Hot-spot at the right edge under high concurrency. All inserts target the same right-most leaf + ancestors. On write-heavy workloads, lock contention on that path can become the bottleneck — InnoDB's row-level locking + latch-free B+tree operations mitigate this in the common case but not at extreme write rates.
Sharding reverses the argument. When a sequential key is used as the shard key on a sharded system, all new writes go to one shard — see concepts/shard-key. Fix: hash the shard key so routing is uniform while still using a sequential PK within each shard.
Doesn't help multi-region writes. A single monotonic counter across regions requires coordination; region-local monotonic + region-prefix composite keys avoid this but cost some of the sequential-locality benefit.
AUTO_INCREMENT gaps exist. Failed transactions + restarts leave holes in the sequence — not a correctness issue but a surprise for users who expect dense IDs.
Still recommended to prefer smaller keys. BIGINT (8 bytes) beats UUIDv7 (16 bytes) on fan-out and secondary-index overhead; use UUIDv7 only if the global-uniqueness or client-side-generation properties justify the extra bytes.
Range scan anti-benefit on email / alphabetical columns. A secondary index on email behaves like a random-PK insert — users don't sign up in alphabetical order. The sequential-PK pattern is about primary key choice; secondary indexes on naturally-unsorted columns pay some of the same cost regardless.

When to break the rule¶

Global uniqueness across uncoordinated writers — UUID (including v7) or sharded monotonic IDs.
Hide row count in URLs — opaque IDs prevent enumeration attacks.
Merge-friendly datasets — two copies of the same table can be combined without ID conflicts.

For these, prefer UUIDv7 over UUIDv4 to keep B+tree locality.

Seen in¶

— Mike Coutermarsh (PlanetScale, 2022-03-29) canonicalises the two-ID realisation of this pattern: BIGINT AUTO_INCREMENT PRIMARY KEY as the internal sequential clustered PK + public_id VARCHAR(12) UNIQUE KEY as the externally-visible NanoID. "We still use standard auto-incrementing BigInts for our primary key. The public_id is only used as an external identifier." Full CREATE TABLE user schema
Rails PublicIdGenerator concern + Go publicid package. First-party disclosure that the 2024 UUID-PK post forward-references — this is where the public-id-alongside-BIGINT-PK pattern is canonically realised, with application-side ID generation + unique- secondary-index enforcement.
— Brian Morrison II (PlanetScale, 2024-03-19) canonicalises the 94% vs 50% InnoDB page fill-factor asymmetry that makes this pattern load-bearing for storage efficiency (near-doubling of page count on random PKs), plus the byte-width amplifiers (CHAR(36) at 9× INT, BINARY(16) at 4× INT). Names three alternative monotonic-ID families — Snowflake, ULID, NanoID — with PlanetScale disclosing their own use of NanoIDs for API identifiers (as external IDs over a BIGINT clustered PK, not as the clustered PK itself). Canonicalises MySQL's UUID_TO_BIN(@u, 1) swap flag as the server-side COMB- UUID helper for v1 values — see concepts/uuid-to-bin-swap-flag.
— canonical articulation of the pattern with the three consequences of sequential-PK inserts + worked visual comparison vs random PK. Extends to non-PK indexes (a created_at timestamp column behaves the same way on a secondary index).