Datomic
Datomic is a transactional database with an immutable fact model: every transaction appends new facts, and existing facts are never overwritten. This produces a queryable history (time-travel queries) and a graph-native data model in which relationships are stored as first-class facts. It was created by Rich Hickey (the author of Clojure) and colleagues at Cognitect, which is now part of Nubank.
This page is a stub created for cross-referencing from sources/2026-05-04-netflix-democratizing-machine-learning-building-the-model-lifecycle-graph, where Datomic is the storage substrate for Netflix's MDS model lifecycle graph.
Core data model: facts
Datomic stores data as facts of the form `[entity, attribute, value, transaction, op]`, where op records whether the fact is an assertion or a retraction. New facts are appended. A retraction is itself a new fact stating that an earlier fact no longer holds, and the original fact remains in history. This produces:
- Time-travel queries: query the database as-of any past point.
- Audit trail by construction: every change is a fact with a transaction reference.
- Schema evolution without migration: adding a new attribute is just adding facts, with no `ALTER TABLE`.
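The fact model can be sketched in a few lines of plain Python. This is a toy in-memory model under stated assumptions, not Datomic's API: the `Fact` tuple, the `FactLog` class, and the `:model/owner` attribute are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from typing import Any, NamedTuple


class Fact(NamedTuple):
    entity: int
    attribute: str
    value: Any
    tx: int       # id of the transaction that recorded this fact
    added: bool   # True for an assertion, False for a retraction


class FactLog:
    """Toy append-only fact log (hypothetical; not Datomic's API)."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []
        self.next_tx = 1

    def transact(self, changes: list[tuple[int, str, Any, bool]]) -> int:
        """Append every change under one new transaction id and return that id."""
        tx = self.next_tx
        self.next_tx += 1
        for entity, attribute, value, added in changes:
            self.facts.append(Fact(entity, attribute, value, tx, added))
        return tx

    def as_of(self, tx: int) -> set[tuple[int, str, Any]]:
        """Replay the log up to and including tx to rebuild the state at that point."""
        state: set[tuple[int, str, Any]] = set()
        for fact in self.facts:
            if fact.tx > tx:
                break  # the log is in transaction order, so nothing later applies
            key = (fact.entity, fact.attribute, fact.value)
            if fact.added:
                state.add(key)
            else:
                state.discard(key)
        return state


log = FactLog()
t1 = log.transact([(1, ":model/owner", "team-a", True)])
# Changing the owner retracts the old value and asserts the new one in one transaction.
t2 = log.transact([
    (1, ":model/owner", "team-a", False),
    (1, ":model/owner", "team-b", True),
])

current = log.as_of(t2)   # only the team-b fact holds
past = log.as_of(t1)      # the team-a fact is still visible as of t1
```

Nothing is ever removed from `log.facts`: the retraction is a third entry in the log, which is what makes both the as-of view and the audit trail possible.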
Why Netflix MDS chose Datomic
"Datomic serves as both the system of record for MDS and the working dataset for enrichment processes. Its immutable fact model means we can continuously add relationships without losing the original entity state."
— sources/2026-05-04-netflix-democratizing-machine-learning-building-the-model-lifecycle-graph
The load-bearing property: enrichment jobs continuously append new edges as they walk multi-hop paths in the graph (concepts/multi-hop-relationship-materialization). On a mutable store, each job would have to read, modify, and write back shared records, so the jobs would need careful ordering or locking to avoid losing concurrent writes. On Datomic's append-only fact model, every new edge is just a new fact. Datomic serializes transactions itself, so concurrent enrichment jobs can append independently without coordinating with one another.
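The contrast between the two write patterns can be shown with a deterministic, single-threaded simulation. This is only an illustrative sketch: the entity names are hypothetical, the interleaving is written out by hand rather than produced by real concurrency, and it does not use any Datomic API.

```python
# Mutable store: each job reads the edge list, modifies its copy, and writes it back.
mutable_store = {"model-1": ["feature-a"]}

snapshot_job1 = list(mutable_store["model-1"])   # job 1 reads
snapshot_job2 = list(mutable_store["model-1"])   # job 2 reads the same state

mutable_store["model-1"] = snapshot_job1 + ["dataset-x"]      # job 1 writes back
mutable_store["model-1"] = snapshot_job2 + ["experiment-y"]   # job 2 overwrites job 1

# Append-only store: each job contributes its own independent edge fact.
edge_facts = [("model-1", "feature-a")]
edge_facts.append(("model-1", "dataset-x"))      # job 1 appends
edge_facts.append(("model-1", "experiment-y"))   # job 2 appends

lost_on_mutable = "dataset-x" not in mutable_store["model-1"]
kept_on_append = ("model-1", "dataset-x") in edge_facts
```

In the mutable case, job 1's edge disappears because job 2 wrote back a stale snapshot. In the append-only case, both edges survive regardless of the order in which the appends land.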
What MDS stores in Datomic:
- All entity attributes as facts.
- Entity references (foreign keys, possibly to entities not yet fully resolved).
- All relationships as reified edges added by enrichment processes. See concepts/reified-edge-graph.
- Entity lifecycle state (uncached / partially-resolved / fully-enriched).
What this enables for MDS:
- Complex graph traversals — "Navigate from a model to its features to their data sources in a single query."
- Entity relationships — "Join across multiple domains without N+1 query problems."
- Flexible schema evolution — "Easy to add new entity types and attributes as the catalog grows."
- Progressive enrichment — "Background jobs efficiently identify and process entities requiring additional hydration, enabling gradual graph completion without reprocessing fully enriched entities."
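Progressive enrichment can be sketched as a loop that selects only entities whose lifecycle state is not yet fully enriched. The three state names come from the list above. Everything else is an assumption for illustration: the `Lifecycle` enum, the one-step-per-pass progression in `NEXT_STATE`, and the function names are hypothetical, and the post does not describe how MDS actually schedules this work.

```python
from enum import Enum
from typing import Dict, List


class Lifecycle(Enum):
    UNCACHED = "uncached"
    PARTIALLY_RESOLVED = "partially-resolved"
    FULLY_ENRICHED = "fully-enriched"


# Assumed progression: each pass moves an entity one step forward.
NEXT_STATE = {
    Lifecycle.UNCACHED: Lifecycle.PARTIALLY_RESOLVED,
    Lifecycle.PARTIALLY_RESOLVED: Lifecycle.FULLY_ENRICHED,
}


def pending(states: Dict[str, Lifecycle]) -> List[str]:
    """Entities that still need hydration, in a stable order."""
    return sorted(e for e, s in states.items() if s is not Lifecycle.FULLY_ENRICHED)


def enrichment_pass(states: Dict[str, Lifecycle]) -> int:
    """Advance every pending entity one step; return how many were processed."""
    todo = pending(states)
    for entity in todo:
        states[entity] = NEXT_STATE[states[entity]]
    return len(todo)


states = {
    "model-1": Lifecycle.FULLY_ENRICHED,
    "feature-a": Lifecycle.PARTIALLY_RESOLVED,
    "dataset-x": Lifecycle.UNCACHED,
}
first = enrichment_pass(states)    # touches feature-a and dataset-x only
second = enrichment_pass(states)   # touches dataset-x only
third = enrichment_pass(states)    # nothing left to do
```

The fully enriched `model-1` is never reprocessed, and the amount of work shrinks on each pass until the graph is complete.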
Use shape: graph traversal
"In practice, we use Datomic for relationship-heavy, navigational queries such as: Starting from this model instance, show me all upstream datasets and downstream experiments. Given this feature, list all consuming models and their owning teams. These queries often span multiple hops in the graph and benefit from Datomic's immutable fact model and efficient joins across entity relationships."
— sources/2026-05-04-netflix-democratizing-machine-learning-building-the-model-lifecycle-graph
Datomic's query language is a dialect of Datalog, which expresses recursive graph walks (transitive closure over a relationship edge) through recursive rules. This is structurally a better fit for "walk the graph" queries than SQL recursive CTEs, and it avoids the N+1 query antipattern of an ORM-driven walk, where each hop issues a separate query per entity.
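What a recursive rule computes can be sketched in Python as a fixed-point iteration over edge facts. This is a sketch of transitive closure in general, not Datomic's query engine or syntax, and the `reachable` function and the entity names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def reachable(edges: Iterable[Tuple[str, str]], start: str) -> Set[str]:
    """Return every node reachable from start by following one or more edges."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)

    found: Set[str] = set()
    frontier = {start}
    while frontier:
        next_frontier: Set[str] = set()
        for node in frontier:
            next_frontier |= adjacency.get(node, set())
        # Keep only newly discovered nodes so that cycles terminate.
        frontier = next_frontier - found
        found |= frontier
    return found


edges = {
    ("model-1", "feature-a"),
    ("feature-a", "dataset-x"),
    ("feature-a", "dataset-y"),
    ("model-2", "feature-b"),
}
upstream = reachable(edges, "model-1")
```

The whole walk runs against one set of edge facts in a single call, which is the shape a recursive Datalog rule gives the caller, in contrast to an application loop that issues a new query for each node it visits.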
Seen in
- sources/2026-05-04-netflix-democratizing-machine-learning-building-the-model-lifecycle-graph — Netflix MDS uses Datomic as its system-of-record graph store.
Caveats
- This wiki page is a stub limited to the role Datomic plays in the Netflix MDS post. Datomic has many capabilities (time-travel queries, the peer architecture, Ions on Datomic Cloud) that are not covered in depth here.
- The post does not disclose Datomic deployment shape (Pro? Cloud? custom-hosted?), cluster topology, or capacity numbers.