CONCEPT

FULLTEXT index¶

A FULLTEXT index is MySQL's native text-search index type: a specialised secondary-index shape for CHAR / VARCHAR / TEXT columns that supports word- level search rather than prefix / equality lookups. Unlike a regular B+tree secondary index, the underlying structure is an inverted index (word → list of rows containing that word) — the same data structure that powers dedicated search engines like Lucene and Elasticsearch.

Canonical framing¶

Justin Gage canonicalises MySQL's FULLTEXT as a native- engine primitive verbatim:

Using the FULLTEXT qualifier will create a text index in most popular relational databases. It can only be applied to text-type columns (CHAR, VARCHAR, or TEXT) in MySQL. Where things get interesting is using these indexes: MySQL provides a lot of functionality out of the box that starts to resemble what you'd expect from a modern text parsing / NLP library.

—

The three search modes¶

MySQL's FULLTEXT index is queried via MATCH() AGAINST() with one of three mode specifiers:

Mode	Query shape	Behaviour
Natural language (default)	`MATCH(col) AGAINST('query')`	No special operators. Tokenises the query, applies built-in stopwords, ranks matches by relevance.
Boolean	`MATCH(col) AGAINST('query' IN BOOLEAN MODE)`	Uses a special query language — roughly analogous to regex but very different. Operators: `+word` (must), `-word` (must not), `"phrase"`, `word*` (prefix), `>word` / `<word` (boost / de-boost).
Query expansion	`MATCH(col) AGAINST('query' WITH QUERY EXPANSION)`	Runs the query twice: a natural-language search, then another search using words from the returned rows — hence "expansion". Expensive; used when query terms may not match the corpus vocabulary directly.

Gage's summary quote:

Each of these methods has tradeoffs, none are one-size-fits-all.

Structural differences from a B+tree secondary index¶

A regular secondary index on a text column (CREATE INDEX title_idx ON posts(title)) stores whole column values in B+tree order. Good for:

Equality: WHERE title = 'exact'
Range / prefix: WHERE title LIKE 'pref%'

Bad for:

Word-level search: WHERE title LIKE '%word%' → full scan (non-sargable; see concepts/wildcard-prefix-non-sargable)
Multi-term relevance ranking
Phrase / proximity queries

A FULLTEXT index instead tokenises the column into words at insert / update time and stores an inverted posting list per token. Querying via MATCH() AGAINST() looks up the query terms in the inverted index, unions / intersects the posting lists per mode, and returns rows with a relevance score. This is the same general shape as a Lucene index — MySQL's FULLTEXT is a simpler, MySQL-native variant.

Stopwords and customisation¶

MySQL's natural-language mode applies a default stopword list (common English words like the, and, is) that are stripped from both queries and indexed content. The stopword list is configurable via innodb_ft_server_stopword_table (server-wide) or innodb_ft_user_stopword_table (per-user). A minimum word length threshold (innodb_ft_min_token_size, default 3 characters) also skips very short terms.

The post explicitly flags customisability as a feature:

no special operators, uses the built-in stopwords that you can customize.

When to use `FULLTEXT` vs a dedicated search engine¶

FULLTEXT is convenient but limited:

Stays inside MySQL — no separate service, no replication pipeline, same transactional boundary as the rest of the data.
Relevance ranking is basic — TF-IDF-like scoring, no BM25 (MySQL 5.7+ offers a limited BM25-like mode under InnoDB), no learned rankers.
Language support is basic — English stemming only in the shipped configuration; multi-language workloads need external tokenisation.
Scale ceiling is lower — index updates on large FULLTEXT indexes are expensive; hot-writes serialised through the inverted-index maintenance.

The standard upgrade path for substantial text-search workloads is systems/elasticsearch / systems/opensearch / a dedicated Lucene-based service fed by CDC from the primary MySQL.

Syntax¶

-- Create
CREATE TABLE articles (
  id BIGINT PRIMARY KEY,
  title VARCHAR(200),
  body TEXT,
  FULLTEXT (title, body)
);

-- Query (natural language, default)
SELECT * FROM articles
WHERE MATCH(title, body) AGAINST('database indexes');

-- Query (boolean mode, required term + excluded term)
SELECT * FROM articles
WHERE MATCH(title, body)
  AGAINST('+database -mongodb' IN BOOLEAN MODE);

Seen in¶

— Justin Gage canonicalises FULLTEXT as a native-MySQL text-search primitive, enumerates the three search modes (natural language / boolean / query expansion) with the regex-analogy caveat for boolean mode and the "runs a regular natural language search, then another one using words from the returned rows" definition for query expansion. First wiki disclosure of the FULLTEXT primitive.

concepts/secondary-index — the default index shape; FULLTEXT is a specialised variant
concepts/wildcard-prefix-non-sargable — why LIKE '%word%' can't use a B+tree index and needs a FULLTEXT alternative
concepts/mysql-explain — type: fulltext is the EXPLAIN access-type for FULLTEXT joins
systems/mysql
systems/innodb
systems/elasticsearch — the canonical external upgrade path
systems/lucene — shared inverted-index lineage