PATTERN Cited by 2 sources
Public ID alongside BIGINT PK¶
Problem¶
An application must expose opaque, stable, URL-friendly identifiers in its public API — so that:
- API consumers can't enumerate rows by sequential scan (
GET /users/1,GET /users/2, ...). - The API doesn't leak row counts ("we wouldn't reveal the count of records in all our tables").
- The identifier is opaque — clients don't depend on internal row ordering.
But the application also wants fast InnoDB clustered-index writes, which requires a sequential primary key (see concepts/uuid-primary-key-antipattern, patterns/sequential-primary-key). A random UUID / NanoID as the clustered PK collapses write performance through:
- Unpredictable insert paths → cache miss per insert.
- Scattered node splits → write amplification.
- ~50% page fill factor instead of ~94% on sequential keys.
You can't satisfy both properties with a single ID column.
Solution¶
Use two IDs per row:
- Internal PK:
BIGINT UNSIGNED AUTO_INCREMENTas the clustered-index primary key. Preserves B+tree insert locality. Never exposed to API consumers. - External ID: a
VARCHAR(N)column holding an opaque, randomly-generated identifier — typically a NanoID, Snowflake, ULID, or UUID — with aUNIQUE KEYsecondary index. Exposed in URLs, JSON payloads, emails.
Canonical schema¶
From PlanetScale's 2022-03-29 post (Source: sources/2026-04-21-planetscale-why-we-chose-nanoids-for-planetscales-api):
CREATE TABLE `user` (
`id` bigint NOT NULL AUTO_INCREMENT,
`public_id` varchar(12) DEFAULT NULL,
`name` varchar(255) NOT NULL,
`created_at` datetime(6) NOT NULL,
`updated_at` datetime(6) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `idx_public_id` (`public_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Two properties:
- id BIGINT AUTO_INCREMENT PRIMARY KEY → monotonically-increasing clustered PK; inserts land on the right-most B+tree path; ~94% page fill factor.
- public_id VARCHAR(12) with UNIQUE KEY idx_public_id → NanoID (PlanetScale's choice; see concepts/nanoid-identifier) enforced unique at DB layer. Secondary-index B+tree takes the random-PK-style ~50% fill-factor hit — but it's a single index, not the whole clustered table, so the cost is bounded.
Generation and enforcement¶
The external ID is generated application-side before INSERT, not server-side. PlanetScale's Rails realisation (concepts/public-id-column):
module PublicIdGenerator
extend ActiveSupport::Concern
included do
before_create :set_public_id
end
PUBLIC_ID_ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
PUBLIC_ID_LENGTH = 12
MAX_RETRY = 1000
def set_public_id
return if public_id.present?
MAX_RETRY.times do
self.public_id = generate_public_id
return unless self.class.where(public_id: public_id).exists?
end
raise "Failed to generate a unique public id after #{MAX_RETRY} attempts"
end
end
Collision handling is bounded-retry: 1000 regeneration attempts before giving up. At reasonable entropy levels (~60+ bits) the retry loop is effectively dead code but the correctness contract against statistical birthday-paradox accidents.
Parallel Go implementation for services that also need to issue public IDs:
const (
alphabet = "0123456789abcdefghijklmnopqrstuvwxyz"
length = 12
)
func New() (string, error) { return nanoid.Generate(alphabet, length) }
Cross-language ecosystem support is a first-order selection criterion when the backend mixes languages (PlanetScale runs Ruby + Go).
Structural properties¶
- Writes scale like sequential PK — inserts hit the right-most leaf of the clustered B+tree. Buffer-pool hit rate for writes approaches 100% regardless of table size.
- Reads by public_id cost one extra B+tree walk — primary-key lookup is
O(log N); public-id lookup isO(log N)on the secondary index +O(log N)on the clustered PK to fetch the row. 2× the logical lookup cost but stillO(log N). - API URLs are opaque, stable, and collision-resistant —
api.planetscale.com/v1/deploy-requests/izkpm55j334usatisfies the double-click-selectability property (alphanumeric, no hyphens) that UUID URLs break. - Row count is hidden — external observers can't infer table cardinality from consecutive public_ids (unlike
/users/1,/users/2). - Internal joins stay cheap — FK relationships still go through the
BIGINTPK.public_idis only surfaced at the API boundary; every internal table relationship uses the sequential integer.
Why it works¶
The pattern is "use the right ID for the right axis":
- Physical storage axis — wants monotonic increment (B+tree locality, page fill, cache hits).
- API surface axis — wants opacity + uniqueness + URL-friendliness + human-copyability.
These are orthogonal requirements that a single ID column can't satisfy. The pattern acknowledges the conflict and resolves it by separating concerns into two columns, each optimised for its own axis, connected by a unique-index B+tree walk.
Variations¶
| Variant | External ID | Rationale |
|---|---|---|
| PlanetScale API | 12-char NanoID, base-36 | Double-click-selectable, compact, cross-language generator ecosystem |
| GitHub API | Numeric, but surface-obfuscated | Row count visible but rate-limited; different threat model |
| Stripe API | prefix_ + opaque alphanumeric (e.g. cus_abc123) |
Prefixes encode object type; alphanumeric tail is effectively a NanoID |
| UUIDv7 | 36-char UUID | Self-sortable + standardised; loses double-click but gains interop |
| ULID | 26-char Crockford base32 | Time-ordered + URL-safe; uppercase ambiguity vs case-sensitivity |
PlanetScale's choice tunes heavily for developer experience of the URL (12-char, lowercase, alphanumeric, no hyphens); other vendors pick different points on the same design surface.
Caveats¶
- Secondary-index write cost is not free — every INSERT writes two B+tree branches: clustered PK (sequential, cheap) +
public_idunique index (random, ~2× the pages of a sequential-inserted index). The pattern isolates the random-write cost to the one secondary index rather than smearing it across the whole table, but doesn't eliminate it. - Index width —
VARCHAR(12)public_id is ~12-14 bytes + overhead per secondary-index entry. Still dramatically less than aCHAR(36)UUID (36 bytes) orBINARY(16)(16 bytes) index. See concepts/uuid-string-representation-mysql for byte-width comparisons. - Race condition on pre-insert
exists?check — the Ruby generator'swhere(public_id: ...).exists?is a read-then-write race. Two concurrent requests generating the same NanoID would both seeexists? == falseand both attempt INSERT; one would fail on the unique-index violation and raiseActiveRecord::RecordNotUnique. At ~60+ bit entropy the race is ~10^-16 per concurrent-insert pair; in practice the MySQL-sideUNIQUE KEYis the actual correctness backstop, not the application-sideexists?check. - Doesn't help with cross-DB migrations — if rows ever move across sharded databases or replicated to a warehouse, the
BIGINTPK is shard-local (AUTO_INCREMENT is not globally unique across shards). Downstream consumers must join onpublic_id, notid. This is the usual sharding-breaks-AUTO_INCREMENT caveat; NanoID saves you here as the globally-unique anchor. - Inbound API idempotency still needs care — NanoIDs don't give retry-idempotency for free. A POST that times out before returning the NanoID leaves the client unsure whether the row was created. Separate idempotency tokens (Stripe-style) are orthogonal to the public-ID pattern.
DEFAULT NULLis a footgun — the PlanetScale example'spublic_id varchar(12) DEFAULT NULLallows rows without public IDs if thebefore_createcallback is bypassed (raw SQL inserts, Rails seed scripts,activerecord-import). The unique-index allows multiple NULLs in MySQL. ConsiderNOT NULL+ a DB-generated default if application-generation is ever bypassed.- Changing the external-ID format is a backward-compatibility cliff — existing 12-char public_ids are in URLs, emails, Slack messages, screenshots, documentation, client-side caches. "Just increase the length" is hand-wave; in practice any format change is a multi-year dual-format migration.
- Doesn't prevent enumeration via timestamp joins — a determined attacker can still estimate row-creation rate by observing
created_aton retrieved rows (if that column is exposed). The pattern hides count but not time. Use Snowflake / ULID if you want time-encoded-in-ID for client-side sorting; NanoID specifically discards this.
Seen in¶
- sources/2026-04-21-planetscale-why-we-chose-nanoids-for-planetscales-api
— Mike Coutermarsh (PlanetScale, 2022-03-29) canonicalises the pattern with the full
CREATE TABLE userschema, thePublicIdGeneratorRails concern, and the Gopublicidpackage. "We still use standard auto-incrementing BigInts for our primary key. The public_id is only used as an external identifier." First-party disclosure that PlanetScale's external API IDs (deploy-request IDs, database IDs, branch IDs) are 12-char base-36 NanoIDs while the underlying rows useBIGINT AUTO_INCREMENTPKs. - sources/2026-04-21-planetscale-the-problem-with-using-a-uuid-primary-key-in-mysql — Brian Morrison II (PlanetScale, 2024-03-19) references this pattern indirectly when recommending UUID alternatives: "Snowflake IDs ... ULIDs ... NanoIDs (which we use at PlanetScale)" — the "which we use at PlanetScale" disclosure is for external IDs, not for clustered PKs, per the 2022 Coutermarsh post this defers to.
Related¶
- concepts/public-id-column
- concepts/nanoid-identifier
- concepts/uuid-primary-key-antipattern
- concepts/nanoid-collision-retry
- concepts/double-click-selectable-identifier
- concepts/clustered-index
- concepts/secondary-index
- concepts/innodb-page-fill-factor
- concepts/snowflake-id
- concepts/ulid-identifier
- patterns/sequential-primary-key
- systems/innodb
- systems/mysql
- systems/planetscale
- systems/ruby-on-rails