CONCEPT Cited by 1 source
Design Away Invalid States¶
Definition¶
Design away invalid states is the architectural principle that invalid states should be unrepresentable — at the protocol level, the data model level, and (when the language supports it) the type system level. The payoff isn't bug prevention per se — it's enabling invariants strong enough to make real bugs distinguishable from acceptable states, which is a prerequisite for meaningful property-based testing or formal verification.
Named and cited as a core Nucleus tenet in Dropbox's 2024 testing-strategy post:
One of our core architectural principles is "Design away invalid system states."
A follow-up post was promised (not yet captured) on how Rust's type system is leveraged to enforce this at compile time.
The canonical example (Nucleus)¶
In Sync Engine Classic, the
server-client protocol allowed the client to receive metadata for
a file /baz/cat before receiving metadata for its parent directory
/baz. Consequences, cascading:
- The local SQLite schema had to represent orphaned nodes (nodes without parent references), because that state was reachable on the wire.
- Every component processing filesystem metadata had to support the orphan case, because it could happen for real.
- Therefore, a real "orphaned file" bug (inconsistency accidentally created by a sync bug) looked identical to an acceptable transient state (metadata about to arrive).
- Which meant: no test could distinguish the two. The invariant "every node has a parent" was a meaningful correctness property you simply couldn't write down.
Nucleus's protocol rejects parentless-node metadata at the wire with a critical error — the client cannot enter a state where an orphan exists. The persisted data model then enforces "no node can exist, even transiently, without a parent directory" as a testable invariant. The tests became both possible and useful.
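A minimal Rust sketch of the boundary check described above. All names here (`NodeId`, `NodeMetadata`, `ProtocolError`, `Tree`) are hypothetical; the source does not show Nucleus's actual types or wire format. The point is only the shape: metadata for a node whose parent is unknown is refused before any local state changes.

```rust
use std::collections::HashMap;

/// Hypothetical node identifier (not the real Nucleus type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct NodeId(pub u64);

/// Hypothetical wire message: metadata for one node and its parent.
#[derive(Debug, Clone)]
pub struct NodeMetadata {
    pub id: NodeId,
    pub parent: NodeId,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ProtocolError {
    /// Metadata arrived for a node whose parent has not been seen yet.
    UnknownParent { node: NodeId, parent: NodeId },
}

/// Local tree. Every stored node points at the root or at another stored node.
pub struct Tree {
    root: NodeId,
    parent_of: HashMap<NodeId, NodeId>,
}

impl Tree {
    pub fn new(root: NodeId) -> Self {
        Tree { root, parent_of: HashMap::new() }
    }

    fn contains(&self, id: NodeId) -> bool {
        id == self.root || self.parent_of.contains_key(&id)
    }

    /// Apply one message. A parentless node is rejected at the boundary,
    /// so no orphan is ever written into `parent_of`.
    pub fn apply(&mut self, msg: NodeMetadata) -> Result<(), ProtocolError> {
        if !self.contains(msg.parent) {
            return Err(ProtocolError::UnknownParent {
                node: msg.id,
                parent: msg.parent,
            });
        }
        self.parent_of.insert(msg.id, msg.parent);
        Ok(())
    }

    pub fn node_count(&self) -> usize {
        self.parent_of.len()
    }
}
```

With this shape, delivering /baz/cat before /baz fails with `UnknownParent` and leaves the tree untouched; delivering /baz first and then /baz/cat succeeds. The sketch deliberately ignores re-parenting and deletion.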
The two moves¶
The principle decomposes into two complementary design moves:
- Shrink the representable-state set to exclude invalid states. This can happen at the type-system level (an enum with 3 variants rather than Option<Option<T>>; sum types for mutually exclusive modes; newtype wrappers that can only be constructed via validating constructors) or at the data-model level (tables with CHECK constraints, foreign keys with ON DELETE CASCADE, representations keyed on invariants that can't be violated by construction).
- Reject the invalid at the boundary, not after. In the Nucleus protocol example, invalid states are rejected the instant they appear on the wire. The anti-patterns to avoid are accept-then-validate (leaves a window where the state is internally invalid), store-then-clean-up (accumulates garbage you can't distinguish from real bugs), and lenient-parser (guesses intent on invalid input and diverges from peer implementations).
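The type-system variants of the first move can be sketched in Rust as follows. The names `FieldUpdate`, `FileName`, and `FileNameError` are illustrative assumptions, as is the reading of the three-variant enum as "leave, clear, or set"; the source names the technique but gives no concrete types.

```rust
/// Three explicit cases instead of `Option<Option<T>>`, where the meaning of
/// `None` versus `Some(None)` would live only in a comment.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FieldUpdate<T> {
    /// Leave the stored value as it is.
    Unchanged,
    /// Remove the stored value.
    Clear,
    /// Replace the stored value.
    Set(T),
}

impl<T> FieldUpdate<T> {
    /// Compute the new stored value from the current one.
    pub fn apply_to(self, current: Option<T>) -> Option<T> {
        match self {
            FieldUpdate::Unchanged => current,
            FieldUpdate::Clear => None,
            FieldUpdate::Set(value) => Some(value),
        }
    }
}

/// Newtype for a single path component. The inner field is private, so code
/// outside this module can only obtain a `FileName` through `parse`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileName(String);

#[derive(Debug, PartialEq, Eq)]
pub enum FileNameError {
    Empty,
    ContainsSeparator,
}

impl FileName {
    /// Validating constructor: the only place the rules are checked.
    pub fn parse(raw: &str) -> Result<Self, FileNameError> {
        if raw.is_empty() {
            return Err(FileNameError::Empty);
        }
        if raw.contains('/') {
            return Err(FileNameError::ContainsSeparator);
        }
        Ok(FileName(raw.to_owned()))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Any function that takes a `FileName` can skip re-validating it, because an empty or slash-containing name cannot be constructed outside the module. That is the second move applied at the type level: the check runs once, at the boundary where raw strings enter.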
Why this is a testability story, not just a code-quality story¶
The Dropbox post is unusual in framing this principle explicitly in testability terms: the reason to design away invalid states isn't aesthetic — it's that property tests, invariant assertions, and formal verification all require the invariant to actually be true. If the system admits orphaned nodes as a transient state, the invariant "there are no orphaned nodes" isn't a property you can check — and the same test that would catch orphan bugs also false-positives on legitimate operation.
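To make the false-positive argument concrete, here is a small Rust sketch of an orphan checker run against two hypothetical apply functions. `apply_lenient` stands in for a Classic-style protocol that accepts any ordering, and `apply_strict` for a Nucleus-style one; both are illustrations written for this page, not code from either sync engine.

```rust
use std::collections::HashMap;

/// Invariant check: every node whose parent is neither the root nor a stored node.
pub fn find_orphans(root: u64, parent_of: &HashMap<u64, u64>) -> Vec<u64> {
    let mut orphans: Vec<u64> = parent_of
        .iter()
        .filter(|(_, parent)| **parent != root && !parent_of.contains_key(*parent))
        .map(|(node, _)| *node)
        .collect();
    orphans.sort_unstable();
    orphans
}

/// Lenient apply (assumed Classic-like): any message is stored,
/// so an orphan is a legal intermediate state.
pub fn apply_lenient(parent_of: &mut HashMap<u64, u64>, node: u64, parent: u64) {
    parent_of.insert(node, parent);
}

/// Strict apply (assumed Nucleus-like): a message whose parent is unknown
/// is refused and nothing is stored.
pub fn apply_strict(
    root: u64,
    parent_of: &mut HashMap<u64, u64>,
    node: u64,
    parent: u64,
) -> Result<(), String> {
    if parent != root && !parent_of.contains_key(&parent) {
        return Err(format!("node {node} references unknown parent {parent}"));
    }
    parent_of.insert(node, parent);
    Ok(())
}
```

Under `apply_lenient`, delivering node 2 (child of 1) before node 1 makes `find_orphans` report node 2 even though nothing has gone wrong, so the check cannot be asserted after every step. Under `apply_strict`, `find_orphans` is empty after every step, and any non-empty result indicates a real bug.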
This is the same motivation as concepts/lightweight-formal-verification (ShardStore's executable spec is simple enough to be specified because the impl's state space is bounded) and related to concepts/memory-safety (Rust's borrow checker eliminates a class of state — dangling pointers — so you never write tests against them).
Complementary: identity design¶
The Nucleus post shows a parallel move in node identity. Classic keyed nodes by path, which made renames transiently visible as "exists at two paths": a state that looked like inconsistency, was inconsistency in the data model you chose, and accidentally encoded real bugs. Nucleus keys on unique IDs, making "a moved folder is visible in exactly one location" a load-bearing invariant. Same principle, different axis: shrink the representable-state set so that a violation of the invariants you care about cannot be expressed.
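A sketch of the ID-keyed idea in Rust, under stated assumptions: `IdKeyedTree`, `Node`, and their methods are hypothetical names, and the real Nucleus data model is not shown in the source. Each node is stored once under its ID, and its path is derived by walking parent links, so a move is a single in-place update with no moment where the node exists at two paths.

```rust
use std::collections::HashMap;

struct Node {
    parent: Option<u64>, // None only for the root
    name: String,
}

/// Nodes keyed by unique ID; paths are derived, never stored.
pub struct IdKeyedTree {
    nodes: HashMap<u64, Node>,
}

impl IdKeyedTree {
    pub fn new(root_id: u64) -> Self {
        let mut nodes = HashMap::new();
        nodes.insert(root_id, Node { parent: None, name: String::new() });
        IdKeyedTree { nodes }
    }

    /// Insert a new node under an existing parent. Returns false if the
    /// parent is unknown or the ID is already taken.
    pub fn insert(&mut self, id: u64, parent: u64, name: &str) -> bool {
        if !self.nodes.contains_key(&parent) || self.nodes.contains_key(&id) {
            return false;
        }
        self.nodes.insert(id, Node { parent: Some(parent), name: name.to_owned() });
        true
    }

    /// True if `candidate` is `of` itself or lies somewhere beneath it.
    fn is_self_or_descendant(&self, candidate: u64, of: u64) -> bool {
        let mut cursor = Some(candidate);
        while let Some(id) = cursor {
            if id == of {
                return true;
            }
            cursor = self.nodes.get(&id).and_then(|n| n.parent);
        }
        false
    }

    /// Move or rename: one update to one record. Moves that would create a
    /// cycle (including any move of the root) are refused.
    pub fn move_node(&mut self, id: u64, new_parent: u64, new_name: &str) -> bool {
        if !self.nodes.contains_key(&new_parent) || self.is_self_or_descendant(new_parent, id) {
            return false;
        }
        match self.nodes.get_mut(&id) {
            Some(node) => {
                node.parent = Some(new_parent);
                node.name = new_name.to_owned();
                true
            }
            None => false,
        }
    }

    /// Derive the path by walking parent links up to the root.
    pub fn path_of(&self, id: u64) -> Option<String> {
        let mut parts = Vec::new();
        let mut current = self.nodes.get(&id)?;
        while let Some(parent) = current.parent {
            parts.push(current.name.as_str());
            current = self.nodes.get(&parent)?;
        }
        parts.reverse();
        Some(format!("/{}", parts.join("/")))
    }
}
```

Because each ID maps to exactly one record, "visible in exactly one location" holds by construction. A path-keyed map would instead need an insert at the new path plus a delete at the old one, with a two-path state in between.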
Seen in¶
- sources/2024-05-31-dropbox-testing-sync-at-dropbox-2020 — explicitly names "design away invalid system states" as a core Nucleus tenet; two worked examples (protocol orphans, path- vs ID-keyed nodes).