Skip to content

CONCEPT Cited by 3 sources

JSON column as schema escape hatch

Definition

JSON column as schema escape hatch is the anti-pattern of using a MySQL JSON (or Postgres JSONB) column to absorb new fields as an alternative to running a schema migration. A developer who needs a new attribute on an existing entity adds a key to the row's JSON blob instead of adding a typed column to the table. No ALTER TABLE fires. The operational apparatus of schema change is sidestepped.

The appeal is operational: no migration means no metadata lock, no gh-ost configuration, no DBA review, no deploy-request queue. The cost is structural: the row now carries a schema inside itself, and that schema is invisible to the database.

Shlomi Noach (2021) names this as workaround #2 of the four developer schema workarounds in response to schema- change operational friction:

"Overloading schema-less JSON columns with more content." (Source: sources/2026-04-21-planetscale-the-promises-and-realities-of-the-relational-database-model)

Why it's an anti-pattern

Four costs compound over time:

(1) No type enforcement

A JSON value can hold a string where an integer was expected, null where a required field was expected, a nested object where a scalar was expected. The database happily stores all of them. Type errors surface at read time, often in the application code of a team that didn't write the original producer — one shape contract silently breaks and the reader is the one who finds out.

(2) No query-planner visibility

The MySQL optimiser has no statistics on values inside a JSON column. A query filtering on a JSON key (WHERE metadata->>'$.status' = 'active') has no selectivity estimate; the planner falls back to a table scan by default. Indexing a JSON key requires a generated column over the JSON_EXTRACT expression plus a functional index — a workflow substantially more operationally involved than simply declaring a typed column would have been. The escape hatch that was supposed to avoid operational work eventually requires more operational work than the migration it replaced.

(3) Shape divergence across rows

Because JSON is unvalidated, different writers at different code revisions write subtly different shapes. Row A has {status: "active", tier: "pro"}, row B has {status: "active", plan: "pro"} (different key name), row C has {state: "live"} (different schema entirely). Queries that filter on any of these keys miss the rows where the key is absent or spelled differently. Bug reports land months later: "the count is wrong."

(4) The future migration is worse

When business requirements eventually demand a typed column (for indexing, reporting, or cross-team contract reasons), the JSON key has to be promoted to a column. This requires:

  • ALTER TABLE to add the new column (the very migration the JSON escape hatch was avoiding).
  • A one-time data migration across every existing row to extract the JSON key and populate the new column. This was not needed at the moment the JSON key was added, because no historical row-rewrite was required back then.
  • Handling shape divergence from cost (3) — rows with the wrong key name need reconciliation; rows missing the key need a default value; rows with the wrong type need coercion or exception.

The net effect: the escape hatch added a row-rewrite migration that didn't originally exist. The pattern is genuinely self-defeating at the decade altitude.

The operational-friction trade

The reason the anti-pattern survives is that costs (1)–(4) are all deferred. The team that writes the JSON key gets the immediate benefit (avoid the migration). The team that later needs to index, filter, or promote the key pays the cost — often a different team, often years later.

Noach's reframe is to attack the cause: if schema migrations are cheap (deploy-request-style, minutes not hours, zero-downtime, revertible), there is no velocity advantage to JSON overloading, and the typed-column choice wins on merit every time. The anti-pattern is not "don't use JSON columns" — they have legitimate uses (MySQL JSON data type page enumerates them: log output, REST API response mirroring, configuration blobs, genuinely variable attributes). The anti-pattern is "use JSON columns to avoid running a migration you should have run."

When the JSON column is the right answer

Legitimate uses of MySQL JSON columns (per concepts/mysql-json-data-type):

  • Log / event payload — application or API log records where the shape is genuinely heterogeneous and query needs are either absent or aggregation-only.
  • Pass-through of a third-party response — store a vendor's JSON response verbatim so you can re-process it later if the contract expands.
  • Configuration blob — single-row config where no cross-row query pattern applies.
  • Genuinely variable attributes — per-customer custom fields, per-product-category attribute sets where the cardinality of shapes genuinely exceeds what a typed-column schema could express.

The diagnostic: "would I ever need to index, filter, or aggregate on a key inside this JSON?" If no, JSON is legitimate. If yes, it's the anti-pattern — use a typed column.

Composition with other concepts

Seen in

  • sources/2026-04-21-planetscale-the-promises-and-realities-of-the-relational-database-model — Shlomi Noach, PlanetScale, 2021-07-13. Canonical anti-pattern naming. Noach lists JSON column overloading as workaround #2 of four developer responses to schema-change operational friction. Frames it as a structurally-caused symptom of an operationally-poor schema-change interface, not as a developer-education failure. The prescription is to fix the cause (make schema change cheap) rather than to evangelise against the symptom (tell developers not to overload JSON).

  • sources/2026-04-21-planetscale-the-mysql-json-data-type — Mike Stojan, PlanetScale, 2022-09-23. Supplies the canonical framing of JSON as a schema-flexibility escape hatch verbatim: "The JSON data type is a bit of an anti-pattern to the rigourousness nature of such a schema. It allows you to break out of it, to gain flexibility when you need it." Stojan's post enumerates the legitimate uses (log / API response / config / variable attributes); this concept page canonicalises the misuse shape that Noach flagged a year earlier.

  • sources/2026-04-21-planetscale-indexing-json-in-mysql — Aaron Francis, PlanetScale. Canonicalises the recovery path when the anti-pattern has been committed: generated-column-plus-functional-index to get a B-tree on an extracted JSON key. The existence of this pattern confirms the anti-pattern cost: retroactively indexing a JSON key requires more operational machinery than typed-column-plus-index would have from the start.

Last updated · 470 distilled / 1,213 read