Skip to content

CONCEPT Cited by 1 source

Join column type mismatch

A join column type mismatch occurs when the two columns on either side of a JOIN ... ON a.x = b.y predicate differ in type, size, or character encoding — and the difference is severe enough that MySQL's optimiser cannot use an index on the joined-to column. The join silently degrades to a full scan of the right-hand table per row of the left-hand table (or vice-versa depending on join order), often invisibly until query volume reveals the cost.

(Source: sources/2026-04-21-planetscale-why-isnt-mysql-using-my-index.)

The rule

If the columns are not of the same type and size, this will preclude using an index. [...] String columns must also use the same charset for an index to be used. If one column uses utf8mb4 and the other uses latin1, this will also preclude the use of an index.

— Aaron Francis, Why isn't MySQL using my index?

Two axes of mismatch matter:

  1. Type + sizeVARCHAR(10) and CHAR(10) are compatible (same length, same general family); VARCHAR(10) and CHAR(15) are not. INT and BIGINT UNSIGNED have edge cases where they still index but with a cast cost.
  2. Character setutf8mb4 and latin1 on joined string columns produces a per-row encoding conversion that prevents the planner from using the other side's index.

See the MySQL manual on Comparison of Indexed Columns for the exact rules on which cross-type combinations still index.

The lengthening fix

Francis's worked mitigation:

It may, in fact, be beneficial to lengthen the VARCHAR column to match the CHAR and allow the use of an index, even if the data won't be 15 characters long.

— Aaron Francis, Why isn't MySQL using my index?

The insight: VARCHAR(N) stores only the actual string length (plus a 1- or 2-byte length prefix), so lengthening VARCHAR(10) to VARCHAR(15) does not waste storage on existing rows. The schema change is cheap; the performance win is not.

Mechanism: the implicit cast

Under the hood, MySQL's expression evaluator inserts an implicit cast when types don't match — a.x = b.y becomes CAST(a.x AS ...) = b.y or vice-versa. This cast has the same effect as index obfuscation: the indexed column is wrapped in a function call from the planner's perspective, so the index on the wrapped side cannot be used for the lookup. The non-wrapped side's index can still be used as the driving index.

Charset mismatch — the hidden cost

String comparisons require both operands to share a character set and collation. MySQL silently converts otherwise — but conversion is a per-row cost that defeats index-based joins. The utf8 vs utf8mb4 migration era left many schemas with mixed charsets across tables, producing exactly this pitfall on long-after-migration JOINs.

Diagnostic:

SELECT table_name, column_name, character_set_name, collation_name
FROM information_schema.COLUMNS
WHERE table_schema = 'your_db'
  AND data_type IN ('char', 'varchar', 'text')
ORDER BY table_name, column_name;

Align character sets and collations before expecting cross-table joins to use indexes.

Best-practice: declare matching join columns identically

it's always best to declare columns that you plan to use in joins as the same size and type.

— Aaron Francis, Why isn't MySQL using my index?

Practical rules:

  • Same base type: both BIGINT UNSIGNED, or both CHAR(N), or both VARCHAR(N).
  • Matching length where type requires it: CHAR(10) + CHAR(10); VARCHAR(10) vs VARCHAR(15) usually fine (both variable-length, but prefer exact match).
  • Matching NOT NULL-ness — doesn't affect indexability but simplifies reasoning.
  • Matching charset + collation for strings: both utf8mb4 / utf8mb4_0900_ai_ci or whatever house standard you've chosen.

For numeric foreign-key columns: match the parent's primary-key type exactly. Signed vs unsigned matters (some cross-indexable cases, but edge-case failures produce the same mystery as charset mismatch).

Diagnostic via EXPLAIN

EXPLAIN SELECT * FROM a JOIN b ON a.x = b.y WHERE ...;
-- Look at the rightmost table's row:
--   possible_keys: index on b.y present?
--   key: is it actually used?
--   type: const/eq_ref/ref (good) vs ALL/index (bad)

If b.y has an index but EXPLAIN shows type: ALL on the b row with key: NULL, the mismatch is the first thing to check.

Seen in

Last updated · 470 distilled / 1,213 read