Skip to content

CONCEPT Cited by 1 source

Generated column (MySQL)

A generated column is a column whose value is the result of a deterministic expression over other columns in the same row — not a value inserted by the application. The database engine, not the application, is responsible for keeping the value in sync with its inputs, so it can never fall out of sync with the data it is derived from.

The expression may reference literals, built-in functions (e.g. PI(), POW, MD5, CONCAT_WS), or other columns in the same row. The result must be scalar and deterministic — non-deterministic functions like NOW() or RAND() are not allowed.

(Source: sources/2026-04-21-planetscale-generated-hash-columns-in-mysql.)

Syntax

col_name data_type [GENERATED ALWAYS] AS (expr)
  [VIRTUAL | STORED] [NOT NULL | NULL]
  [UNIQUE [KEY]] [[PRIMARY] KEY]
  [COMMENT 'string']

Aaron Francis's worked example on a circles table:

CREATE TABLE circles (
  diameter DOUBLE,
  radius DOUBLE AS (diameter / 2),        -- generated
  area   DOUBLE AS (PI() * POW(radius, 2))  -- generated, references another generated column
);

Inserting only diameter populates all three columns; SELECT * returns the computed radius and area with no additional application-side logic.

VIRTUAL vs STORED

Property VIRTUAL (default) STORED
On-disk bytes Zero in the row Full column stored
Read cost Re-compute each access Direct read
Write cost None on INSERT/UPDATE Recomputed + written on change
Indexable Yes (materialised in index leaves) Yes
Appropriate when Expression is cheap Expression is expensive

Aaron Francis's rule of thumb: "if it's expensive to calculate the value, store it." For cheap arithmetic on already-read columns (diameter / 2), VIRTUAL is fine — the expression is evaluated on read but that's cheaper than the extra I/O a STORED column would cost. For expensive transforms (hashing a long TEXT column on every row access), STORED amortises the cost to write time.

Both kinds can be indexed. A VIRTUAL column's index stores the computed value in the index leaves even though the row itself doesn't — so indexed lookups via the generated column work identically to indexed lookups via a stored column.

Always in sync

Because the expression is the column's only source of data, the generated column can never diverge from the columns it depends on. There is no application code-path that could forget to update it; there is no backfill race during schema change; there is no dual-write consistency problem. This is the load-bearing property that makes generated hash columns a reliable substitute for hand-maintained denormalised columns.

Canonical application: generated hash columns

The most common production use of generated columns in MySQL is to materialise a compact hash of a wide or long column, index the hash, and use the index for fast equality lookups on values too large to index directly. See concepts/generated-hash-column + patterns/generated-hash-column-for-equality-lookup.

Relationship to functional indexes

MySQL 8.0.13 introduced functional indexes — indexes on an expression directly, without naming an intermediate generated column. Internally, functional indexes are implemented as hidden virtual generated columns with an index on them, so the on-disk shape is identical; the difference is purely syntactic.

Seen in

  • sources/2026-04-21-planetscale-generated-hash-columns-in-mysql — canonical wiki introduction of MySQL generated columns via Aaron Francis's pedagogical post. Covers syntax, VIRTUAL vs STORED trade-off, and the expensive-to- compute rule of thumb. The post's main application is hashing but the generated-column primitive generalises beyond hashing to any deterministic derived value.
  • systems/mysql — the substrate.
  • systems/innodb — storage engine where indexed generated columns materialise into secondary-index B+tree leaves.
Last updated · 347 distilled / 1,201 read