Skip to content

CONCEPT Cited by 1 source

FULLTEXT index

A FULLTEXT index is MySQL's native text-search index type: a specialised secondary-index shape for CHAR / VARCHAR / TEXT columns that supports word- level search rather than prefix / equality lookups. Unlike a regular B+tree secondary index, the underlying structure is an inverted index (word → list of rows containing that word) — the same data structure that powers dedicated search engines like Lucene and Elasticsearch.

Canonical framing

Justin Gage canonicalises MySQL's FULLTEXT as a native- engine primitive verbatim:

Using the FULLTEXT qualifier will create a text index in most popular relational databases. It can only be applied to text-type columns (CHAR, VARCHAR, or TEXT) in MySQL. Where things get interesting is using these indexes: MySQL provides a lot of functionality out of the box that starts to resemble what you'd expect from a modern text parsing / NLP library.

sources/2026-04-21-planetscale-how-do-database-indexes-work

The three search modes

MySQL's FULLTEXT index is queried via MATCH() AGAINST() with one of three mode specifiers:

Mode Query shape Behaviour
Natural language (default) MATCH(col) AGAINST('query') No special operators. Tokenises the query, applies built-in stopwords, ranks matches by relevance.
Boolean MATCH(col) AGAINST('query' IN BOOLEAN MODE) Uses a special query language — roughly analogous to regex but very different. Operators: +word (must), -word (must not), "phrase", word* (prefix), >word / <word (boost / de-boost).
Query expansion MATCH(col) AGAINST('query' WITH QUERY EXPANSION) Runs the query twice: a natural-language search, then another search using words from the returned rows — hence "expansion". Expensive; used when query terms may not match the corpus vocabulary directly.

Gage's summary quote:

Each of these methods has tradeoffs, none are one-size-fits-all.

Structural differences from a B+tree secondary index

A regular secondary index on a text column (CREATE INDEX title_idx ON posts(title)) stores whole column values in B+tree order. Good for:

  • Equality: WHERE title = 'exact'
  • Range / prefix: WHERE title LIKE 'pref%'

Bad for:

A FULLTEXT index instead tokenises the column into words at insert / update time and stores an inverted posting list per token. Querying via MATCH() AGAINST() looks up the query terms in the inverted index, unions / intersects the posting lists per mode, and returns rows with a relevance score. This is the same general shape as a Lucene index — MySQL's FULLTEXT is a simpler, MySQL-native variant.

Stopwords and customisation

MySQL's natural-language mode applies a default stopword list (common English words like the, and, is) that are stripped from both queries and indexed content. The stopword list is configurable via innodb_ft_server_stopword_table (server-wide) or innodb_ft_user_stopword_table (per-user). A minimum word length threshold (innodb_ft_min_token_size, default 3 characters) also skips very short terms.

The post explicitly flags customisability as a feature:

no special operators, uses the built-in stopwords that you can customize.

When to use FULLTEXT vs a dedicated search engine

FULLTEXT is convenient but limited:

  • Stays inside MySQL — no separate service, no replication pipeline, same transactional boundary as the rest of the data.
  • Relevance ranking is basic — TF-IDF-like scoring, no BM25 (MySQL 5.7+ offers a limited BM25-like mode under InnoDB), no learned rankers.
  • Language support is basic — English stemming only in the shipped configuration; multi-language workloads need external tokenisation.
  • Scale ceiling is lower — index updates on large FULLTEXT indexes are expensive; hot-writes serialised through the inverted-index maintenance.

The standard upgrade path for substantial text-search workloads is systems/elasticsearch / systems/opensearch / a dedicated Lucene-based service fed by CDC from the primary MySQL.

Syntax

-- Create
CREATE TABLE articles (
  id BIGINT PRIMARY KEY,
  title VARCHAR(200),
  body TEXT,
  FULLTEXT (title, body)
);

-- Query (natural language, default)
SELECT * FROM articles
WHERE MATCH(title, body) AGAINST('database indexes');

-- Query (boolean mode, required term + excluded term)
SELECT * FROM articles
WHERE MATCH(title, body)
  AGAINST('+database -mongodb' IN BOOLEAN MODE);

Seen in

  • sources/2026-04-21-planetscale-how-do-database-indexes-work — Justin Gage canonicalises FULLTEXT as a native-MySQL text-search primitive, enumerates the three search modes (natural language / boolean / query expansion) with the regex-analogy caveat for boolean mode and the "runs a regular natural language search, then another one using words from the returned rows" definition for query expansion. First wiki disclosure of the FULLTEXT primitive.
Last updated · 550 distilled / 1,221 read