CONCEPT Cited by 1 source
Optimizer index-expression rewrite¶
Optimizer index-expression rewrite is the query-planner
capability of recognising that a WHERE predicate
expression matches the expression underneath an indexed
generated column or a
functional index, and
substituting the indexed-column reference for the raw
expression so the index is actually used.
It's the load-bearing ergonomic property that makes indexed expressions viable in a real codebase: callers don't have to know the indexed column exists — they write the natural predicate, the optimizer rewrites it under the hood.
Why it matters¶
Without expression rewrite, an indexed derived value is leaky: every caller has to reference the generated column by name to hit the index. A query like
would table-scan even though the DBA has added a generated
email column + B-tree index. With expression rewrite, the
planner recognises that
properties->>"$.request.email" matches the generated
column's GENERATED ALWAYS AS (...) expression and rewrites
the query to
before execution. The raw-expression predicate now uses the index with no application change.
Worked example (MySQL)¶
From Aaron Francis's pedagogical PlanetScale post:
-- Schema
ALTER TABLE activity_log ADD COLUMN email VARCHAR(255)
GENERATED ALWAYS AS (properties->>"$.request.email");
ALTER TABLE activity_log ADD INDEX email(email) USING BTREE;
-- Query written against the raw JSON extraction:
EXPLAIN SELECT * FROM activity_log
WHERE properties->>"$.request.email" = 'little.bobby@tables.com';
EXPLAIN reports possible_keys: email, key: email,
key_len: 768 — the index is used. SHOW WARNINGS
immediately after reveals the rewrite:
Note 1003: /* select#1 */ select `activity_log`.`id` ...
from `activity_log` where
(`activity_log`.`email` = 'little.bobby@tables.com')
The optimizer substituted activity_log.email for the
properties->>"$.request.email" expression. The longhand
equivalent is rewritten too:
also hits the index, via the same rewrite (Source: sources/2026-04-21-planetscale-indexing-json-in-mysql).
Why third-party-ORM cases benefit most¶
Aaron Francis's canonical framing: "This is especially useful if you're unable to control the access pattern because the query is being issued from a 3rd party package in your codebase." A reporting tool, ORM, or query builder that emits the raw JSON-extraction predicate still gets the index — no need to fork the library or translate queries at an application layer.
Expression-matching constraints¶
The rewrite requires the planner's internal expression tree to match byte-for-byte after parsing. Practical consequences:
- Argument order matters.
JSON_EXTRACT(col, path)andJSON_EXTRACT(col, path)match; any variation in argument spelling (literal quoting, whitespace inside function calls) can prevent match. - Equivalent but distinct expressions don't match.
col + 1 - 2andcol - 1are arithmetically equal but parse to different trees. - Collation must match. See
concepts/functional-index-collation-mismatch for the
JSON-specific failure mode: a
CAST(... AS CHAR(255))withoutCOLLATE utf8mb4_binparses identically to one with it, but the collation annotation on the expression differs and the rewrite silently fails. - Escape hatch: query the named column directly. If the
rewrite is flaky, writing
WHERE email = '...'sidesteps it entirely.
A named generated column is strictly safer than a functional index for this reason — the expression lives once in the schema, and application queries can bind to the column name instead of restating the expression.
Generalises beyond JSON¶
The PlanetScale post surfaces this as a JSON-indexing footnote, but the property is substrate-general. Any planner that matches expression signatures against indexed expressions can do this:
- MySQL 8.x — generated columns, functional indexes.
- Postgres — expression indexes
(
CREATE INDEX idx ON t ((lower(email)));) with matchingWHERE lower(email) = '...'. Same optimizer behaviour. - SQL Server — persisted computed columns with
WITH SCHEMABINDING; the optimizer matches predicates that compute the same expression. - Oracle — function-based indexes; same property.
Worth canonicalising as a cross-engine concept because every expression-index design in every RDBMS depends on it to be useful in practice.
Seen in¶
- sources/2026-04-21-planetscale-indexing-json-in-mysql
— canonical wiki introduction via Aaron Francis's worked
example on JSON-extraction expressions. Demonstrates
rewrite both through
EXPLAIN(showskey: email) and throughSHOW WARNINGS(reveals the rewritten query text) for both->>-shorthand andJSON_UNQUOTE(JSON_EXTRACT(...))longhand forms.