CONCEPT Cited by 1 source
FULLTEXT index¶
A FULLTEXT index is MySQL's native
text-search index type: a specialised secondary-index shape
for CHAR / VARCHAR / TEXT columns that supports word-
level search rather than prefix / equality lookups. Unlike
a regular B+tree secondary
index, the underlying structure is an inverted index
(word → list of rows containing that word) — the same data
structure that powers dedicated search engines like
Lucene and
Elasticsearch.
Canonical framing¶
Justin Gage canonicalises MySQL's FULLTEXT as a native-
engine primitive verbatim:
Using the
FULLTEXTqualifier will create a text index in most popular relational databases. It can only be applied to text-type columns (CHAR,VARCHAR, orTEXT) in MySQL. Where things get interesting is using these indexes: MySQL provides a lot of functionality out of the box that starts to resemble what you'd expect from a modern text parsing / NLP library.— sources/2026-04-21-planetscale-how-do-database-indexes-work
The three search modes¶
MySQL's FULLTEXT index is queried via MATCH() AGAINST()
with one of three mode specifiers:
| Mode | Query shape | Behaviour |
|---|---|---|
| Natural language (default) | MATCH(col) AGAINST('query') |
No special operators. Tokenises the query, applies built-in stopwords, ranks matches by relevance. |
| Boolean | MATCH(col) AGAINST('query' IN BOOLEAN MODE) |
Uses a special query language — roughly analogous to regex but very different. Operators: +word (must), -word (must not), "phrase", word* (prefix), >word / <word (boost / de-boost). |
| Query expansion | MATCH(col) AGAINST('query' WITH QUERY EXPANSION) |
Runs the query twice: a natural-language search, then another search using words from the returned rows — hence "expansion". Expensive; used when query terms may not match the corpus vocabulary directly. |
Gage's summary quote:
Each of these methods has tradeoffs, none are one-size-fits-all.
Structural differences from a B+tree secondary index¶
A regular secondary index on a text column (CREATE INDEX
title_idx ON posts(title)) stores whole column values in
B+tree order. Good for:
- Equality:
WHERE title = 'exact' - Range / prefix:
WHERE title LIKE 'pref%'
Bad for:
- Word-level search:
WHERE title LIKE '%word%'→ full scan (non-sargable; see concepts/wildcard-prefix-non-sargable) - Multi-term relevance ranking
- Phrase / proximity queries
A FULLTEXT index instead tokenises the column into words
at insert / update time and stores an inverted posting
list per token. Querying via MATCH() AGAINST() looks up
the query terms in the inverted index, unions / intersects
the posting lists per mode, and returns rows with a
relevance score. This is the same general shape as a
Lucene index — MySQL's FULLTEXT is a
simpler, MySQL-native variant.
Stopwords and customisation¶
MySQL's natural-language mode applies a default stopword
list (common English words like the, and, is) that
are stripped from both queries and indexed content. The
stopword list is configurable via innodb_ft_server_stopword_table
(server-wide) or innodb_ft_user_stopword_table (per-user).
A minimum word length threshold (innodb_ft_min_token_size,
default 3 characters) also skips very short terms.
The post explicitly flags customisability as a feature:
no special operators, uses the built-in stopwords that you can customize.
When to use FULLTEXT vs a dedicated search engine¶
FULLTEXT is convenient but limited:
- Stays inside MySQL — no separate service, no replication pipeline, same transactional boundary as the rest of the data.
- Relevance ranking is basic — TF-IDF-like scoring, no BM25 (MySQL 5.7+ offers a limited BM25-like mode under InnoDB), no learned rankers.
- Language support is basic — English stemming only in the shipped configuration; multi-language workloads need external tokenisation.
- Scale ceiling is lower — index updates on large
FULLTEXTindexes are expensive; hot-writes serialised through the inverted-index maintenance.
The standard upgrade path for substantial text-search workloads is systems/elasticsearch / systems/opensearch / a dedicated Lucene-based service fed by CDC from the primary MySQL.
Syntax¶
-- Create
CREATE TABLE articles (
id BIGINT PRIMARY KEY,
title VARCHAR(200),
body TEXT,
FULLTEXT (title, body)
);
-- Query (natural language, default)
SELECT * FROM articles
WHERE MATCH(title, body) AGAINST('database indexes');
-- Query (boolean mode, required term + excluded term)
SELECT * FROM articles
WHERE MATCH(title, body)
AGAINST('+database -mongodb' IN BOOLEAN MODE);
Seen in¶
- sources/2026-04-21-planetscale-how-do-database-indexes-work
— Justin Gage canonicalises
FULLTEXTas a native-MySQL text-search primitive, enumerates the three search modes (natural language / boolean / query expansion) with the regex-analogy caveat for boolean mode and the "runs a regular natural language search, then another one using words from the returned rows" definition for query expansion. First wiki disclosure of theFULLTEXTprimitive.
Related¶
- concepts/secondary-index — the default index shape;
FULLTEXTis a specialised variant - concepts/wildcard-prefix-non-sargable — why
LIKE '%word%'can't use a B+tree index and needs aFULLTEXTalternative - concepts/mysql-explain —
type: fulltextis theEXPLAINaccess-type forFULLTEXTjoins - systems/mysql
- systems/innodb
- systems/elasticsearch — the canonical external upgrade path
- systems/lucene — shared inverted-index lineage