CONCEPT Cited by 1 source
Domain-knowledge redundant condition¶
A domain-knowledge redundant condition is a redundant-condition variant whose redundancy depends not on logic but on an application-level invariant the database cannot prove. The extra predicate could change the result set under the general semantics of SQL, but cannot in this application because something the code always maintains makes it a true superset of the real predicate.
Aaron Francis's canonical example:
If, given our understanding of the application, we can be sure that
created_atis always equal to or earlier thanupdated_at, we can use this to our advantage.— Aaron Francis, Using redundant conditions to unlock indexes in MySQL
(Source: sources/2026-04-21-planetscale-using-redundant-conditions-to-unlock-indexes-in-mysql.)
The canonical worked example¶
Schema has an index on created_at but none on
updated_at:
-- Slow: full table scan, no index on updated_at
SELECT * FROM todos
WHERE updated_at < '2023-01-01 00:00:00';
-- Fast: domain-knowledge redundant condition on created_at
SELECT * FROM todos
WHERE updated_at < '2023-01-01 00:00:00'
AND created_at < '2023-01-01 00:00:00';
The created_at predicate is not logically redundant —
in pure SQL semantics, there exist database states where a
row has updated_at < '2023-01-01' but created_at >=
'2023-01-01' (the row was inserted in 2024 with a
backdated updated_at). But by application
construction this state cannot arise: the application
always sets created_at on INSERT and never backdates
updated_at below created_at. So in this app
created_at ≤ updated_at universally, and the redundant
predicate is a true superset of the real predicate.
The planner uses the created_at index to narrow the row
set, and the updated_at predicate then filters out false
positives (rows where created_at < 2023-01-01 but
updated_at >= 2023-01-01 — a row created in 2022 and
updated in 2023, which the real predicate excludes).
Distinguishing from logical redundancy¶
| Logically redundant | Domain-knowledge redundant | |
|---|---|---|
| Proof | Predicate cannot exclude a row on any database state — provable from SQL semantics alone. | Predicate could exclude rows in principle — cannot in this app because of an invariant. |
| Example | id < 5 AND id < 10 (strict mathematical superset) |
updated_at < X AND created_at < X (where app always sets created_at ≤ updated_at) |
| Safety | Cannot go wrong — the redundant predicate's superset-ness is a theorem. | Goes wrong if the invariant is ever violated — the redundant predicate silently drops rows. |
| Refactor risk | None — purely local, query rewrite only. | Global — depends on every insert and update in the codebase maintaining the invariant forever. |
Francis puts it in the original post:
These are nice because they are easy to reason about and require no further domain knowledge. There are scenarios where you, as a human, might have more knowledge than the database does.
Failure modes¶
The invariant being correct right now is not enough — it must be correct forever under all writes the application or any adjacent system will ever perform. The typical failure modes the Francis post does not enumerate:
- Backfill from another system. Import script sets
updated_at = NOW()on every row butcreated_at = original.created_atfrom the source system. The invariantcreated_at ≤ updated_atstill holds. Conversely: import setscreated_at = NOW()butupdated_at = original.updated_at— if the original was updated in 2022 but created fresh today,updated_at < created_atand the invariant inverts. - GDPR-style forensic
UPDATE. Compliance team runsUPDATE users SET created_at = NULL WHERE deleted = trueto anonymise account-creation dates; the invariant now fails for the affected rows. - Clock skew. A multi-region write lands on a replica
with a clock 30s behind the primary's clock; the row's
updated_at(set by application code using local clock) could be before the row'screated_at(set by the primary'sDEFAULT CURRENT_TIMESTAMP). Rare but not impossible. - Manual admin update. Engineer runs an
UPDATE table SET updated_at = '2020-01-01' WHERE id = 42to suppress a timestamp for testing or data-repair; the invariant fails silently for that one row. - Triggered/cascaded writes. A trigger on another
table updates this table's
updated_atbut notcreated_at; schema change adds the trigger without anyone remembering the invariant. - Migration from a system that didn't maintain the
invariant. Previous SaaS exported data where
updated_at < created_atfor a subset of rows; new system preserves both values on import.
Each of these silently corrupts the query result. The symptoms are subtle — missing rows in a report, a time-series dashboard that shouldn't be monotone becoming non-monotone. The bug is nearly invisible in integration testing unless tests exercise the specific data shape.
When to use it¶
The safety margin is much narrower than for logical redundancy, and the reasoning must live somewhere:
- Document the invariant inline. Comment the redundant
condition with the exact invariant:
-- redundant: app maintains created_at <= updated_at for all rows. - Own the invariant at the schema level. A
CHECK (created_at <= updated_at)constraint (MySQL 8.0.16+) turns the invariant into a database-enforced guarantee, converting a domain-knowledge redundant condition into a logically-redundant one. - Prefer the alternative. If the column will be queried
repeatedly, add an index on it (
CREATE INDEX idx_updated ON todos (updated_at)) instead — one ALTER is a one-time cost, and the index guarantees correctness no matter what the application does. - Reserve for ad-hoc / one-time queries. The canonical application is the retrospective query that only runs once (backfill, reporting, audit) where adding an index would be wasteful. Under those conditions the domain-knowledge redundant condition is the lowest-cost option even considering its fragility.
Relationship to logical redundancy¶
A domain-knowledge redundant condition can graduate to logical by enforcing the invariant at a lower level:
- Add a
CHECKconstraint (MySQL 8.0.16+, Postgres always): the database now proves the invariant on every write. The query's redundant condition is now logically redundant by construction. - Add a
BEFORE UPDATEtrigger that rejects any update whereupdated_at < created_at: same effect via enforcement rather than validation. - Wrap all writes through an ORM callback / service layer that asserts the invariant: weaker but still significant — the invariant is now a single-enforcement-point property rather than a scattered assumption.
The converse — a logically-redundant condition degrading to domain-knowledge — cannot happen; logical redundancy is a property of SQL semantics, not of database state.
Seen in¶
- sources/2026-04-21-planetscale-using-redundant-conditions-to-unlock-indexes-in-mysql
— canonical introduction. Aaron Francis names the
distinction between logical and domain-knowledge
redundancy and walks the
created_at ≤ updated_atinvariant as the teaching example. The post's framing is optimistic ("you might have more knowledge than the database does") and does not enumerate the failure modes this wiki page canonicalises. - systems/mysql — the planner substrate.
- systems/innodb — where the
created_atindex lives.