Skip to content

CONCEPT Cited by 1 source

Domain-knowledge redundant condition

A domain-knowledge redundant condition is a redundant-condition variant whose redundancy depends not on logic but on an application-level invariant the database cannot prove. The extra predicate could change the result set under the general semantics of SQL, but cannot in this application because something the code always maintains makes it a true superset of the real predicate.

Aaron Francis's canonical example:

If, given our understanding of the application, we can be sure that created_at is always equal to or earlier than updated_at, we can use this to our advantage.

— Aaron Francis, Using redundant conditions to unlock indexes in MySQL

(Source: sources/2026-04-21-planetscale-using-redundant-conditions-to-unlock-indexes-in-mysql.)

The canonical worked example

Schema has an index on created_at but none on updated_at:

-- Slow: full table scan, no index on updated_at
SELECT * FROM todos
WHERE updated_at < '2023-01-01 00:00:00';

-- Fast: domain-knowledge redundant condition on created_at
SELECT * FROM todos
WHERE updated_at < '2023-01-01 00:00:00'
  AND created_at < '2023-01-01 00:00:00';

The created_at predicate is not logically redundant — in pure SQL semantics, there exist database states where a row has updated_at < '2023-01-01' but created_at >= '2023-01-01' (the row was inserted in 2024 with a backdated updated_at). But by application construction this state cannot arise: the application always sets created_at on INSERT and never backdates updated_at below created_at. So in this app created_at ≤ updated_at universally, and the redundant predicate is a true superset of the real predicate.

The planner uses the created_at index to narrow the row set, and the updated_at predicate then filters out false positives (rows where created_at < 2023-01-01 but updated_at >= 2023-01-01 — a row created in 2022 and updated in 2023, which the real predicate excludes).

Distinguishing from logical redundancy

Logically redundant Domain-knowledge redundant
Proof Predicate cannot exclude a row on any database state — provable from SQL semantics alone. Predicate could exclude rows in principle — cannot in this app because of an invariant.
Example id < 5 AND id < 10 (strict mathematical superset) updated_at < X AND created_at < X (where app always sets created_at ≤ updated_at)
Safety Cannot go wrong — the redundant predicate's superset-ness is a theorem. Goes wrong if the invariant is ever violated — the redundant predicate silently drops rows.
Refactor risk None — purely local, query rewrite only. Global — depends on every insert and update in the codebase maintaining the invariant forever.

Francis puts it in the original post:

These are nice because they are easy to reason about and require no further domain knowledge. There are scenarios where you, as a human, might have more knowledge than the database does.

Failure modes

The invariant being correct right now is not enough — it must be correct forever under all writes the application or any adjacent system will ever perform. The typical failure modes the Francis post does not enumerate:

  1. Backfill from another system. Import script sets updated_at = NOW() on every row but created_at = original.created_at from the source system. The invariant created_at ≤ updated_at still holds. Conversely: import sets created_at = NOW() but updated_at = original.updated_at — if the original was updated in 2022 but created fresh today, updated_at < created_at and the invariant inverts.
  2. GDPR-style forensic UPDATE. Compliance team runs UPDATE users SET created_at = NULL WHERE deleted = true to anonymise account-creation dates; the invariant now fails for the affected rows.
  3. Clock skew. A multi-region write lands on a replica with a clock 30s behind the primary's clock; the row's updated_at (set by application code using local clock) could be before the row's created_at (set by the primary's DEFAULT CURRENT_TIMESTAMP). Rare but not impossible.
  4. Manual admin update. Engineer runs an UPDATE table SET updated_at = '2020-01-01' WHERE id = 42 to suppress a timestamp for testing or data-repair; the invariant fails silently for that one row.
  5. Triggered/cascaded writes. A trigger on another table updates this table's updated_at but not created_at; schema change adds the trigger without anyone remembering the invariant.
  6. Migration from a system that didn't maintain the invariant. Previous SaaS exported data where updated_at < created_at for a subset of rows; new system preserves both values on import.

Each of these silently corrupts the query result. The symptoms are subtle — missing rows in a report, a time-series dashboard that shouldn't be monotone becoming non-monotone. The bug is nearly invisible in integration testing unless tests exercise the specific data shape.

When to use it

The safety margin is much narrower than for logical redundancy, and the reasoning must live somewhere:

  • Document the invariant inline. Comment the redundant condition with the exact invariant: -- redundant: app maintains created_at <= updated_at for all rows.
  • Own the invariant at the schema level. A CHECK (created_at <= updated_at) constraint (MySQL 8.0.16+) turns the invariant into a database-enforced guarantee, converting a domain-knowledge redundant condition into a logically-redundant one.
  • Prefer the alternative. If the column will be queried repeatedly, add an index on it (CREATE INDEX idx_updated ON todos (updated_at)) instead — one ALTER is a one-time cost, and the index guarantees correctness no matter what the application does.
  • Reserve for ad-hoc / one-time queries. The canonical application is the retrospective query that only runs once (backfill, reporting, audit) where adding an index would be wasteful. Under those conditions the domain-knowledge redundant condition is the lowest-cost option even considering its fragility.

Relationship to logical redundancy

A domain-knowledge redundant condition can graduate to logical by enforcing the invariant at a lower level:

  1. Add a CHECK constraint (MySQL 8.0.16+, Postgres always): the database now proves the invariant on every write. The query's redundant condition is now logically redundant by construction.
  2. Add a BEFORE UPDATE trigger that rejects any update where updated_at < created_at: same effect via enforcement rather than validation.
  3. Wrap all writes through an ORM callback / service layer that asserts the invariant: weaker but still significant — the invariant is now a single-enforcement-point property rather than a scattered assumption.

The converse — a logically-redundant condition degrading to domain-knowledge — cannot happen; logical redundancy is a property of SQL semantics, not of database state.

Seen in

Last updated · 470 distilled / 1,213 read