Skip to content

CONCEPT Cited by 1 source

BLOB/TEXT index prefix requirement

MySQL requires that any B-tree index on a BLOB or TEXT column declare a prefix length — the number of leading bytes the index covers — because the full column value is variable-length and may be too large to fit in an index node. The server refuses to create a full BLOB/TEXT index and instead forces the schema to choose a prefix:

CREATE INDEX url_idx ON visits(url(255));  -- first 255 bytes only

This is a purely structural constraint of the B+tree: index entries must fit in a fixed node-size budget (concepts/b-plus-tree|16-KB InnoDB pages by default), and an unbounded variable-length value has no fixed budget.

(Source: sources/2026-04-21-planetscale-generated-hash-columns-in-mysql and the MySQL column-indexes reference.)

Consequence: strict equality on long values needs a different mechanism

A prefix index supports ORDER BY, LIKE 'prefix%', and range scans on the prefix, but it cannot support strict equality on the full value — two distinct values that share the same prefix land on the same index entry, so a WHERE url = '...' query has to re-check the full column for every prefix match. For long URLs, document hashes, or CASS-normalised addresses stored in TEXT, the prefix that would need to be indexed to preserve distinctness could easily exceed the whole value.

MySQL's own manual recommends the workaround explicitly:

Rather than testing for equality against a very long text string, you can store a hash of the column value in a separate column, index that column, and test the hashed value in queries. (Use the MD5() or CRC32() function to produce the hash value.) — MySQL 8 manual — Optimising BLOB

This is the canonical motivation for the generated hash column pattern — a BINARY(16) MD5 of the TEXT is both small enough to fit in an index node and narrow enough to make a B+tree shallow and a buffer-pool working set small.

  • InnoDB index-key length cap. Historically 767 bytes per index key (or 3,072 bytes with innodb_large_prefix + DYNAMIC/COMPRESSED row format); the modern MySQL 8 default is 3,072 bytes.
  • utf8mb4 character width. A 4-byte-per-character encoding turns a VARCHAR(N) into up to 4N bytes at the index altitude, compressing the maximum indexable prefix in character terms. See concepts/utf8mb4-vs-utf8.

Seen in

Last updated · 347 distilled / 1,201 read