CONCEPT Cited by 1 source
Logical data model¶
Definition¶
A logical data model is "a set of definitions of tables and columns in which the consolidated record pulled, matched, and merged from the different sources is stored" (sources/2021-07-28-zalando-knowledge-graph-technologies-accelerate-and-improve-the-data-model-definition).
In an MDM context, the logical data model is the schema of the golden record. It is the primary deliverable of the design phase: every downstream MDM artifact (user interface, business processes, matching rules, data storage layout) is shaped by it.
Why it matters¶
Because the logical data model drives so many downstream artifacts, errors in it are expensive to recover from late in the project. Zalando names this explicitly: "the logical data model is a main driver for the effort of creating a MDM tool effecting user interface, processes, business rules, and data storage, [and so] this risk might have a large impact and delays the business value delivery" (sources/2021-07-28-zalando-knowledge-graph-technologies-accelerate-and-improve-the-data-model-definition).
Authored directly vs. generated¶
Zalando identifies two approaches:
- Authored directly (traditional). Engineers write SQL DDL, Excel spreadsheets, or ER diagrams. The authoring effort scales linearly with the number of source tables, and the resulting files create a communication gap with non-technical domain experts: "for business stakeholders that are domain experts the understanding of contents and how they relate to each other is hard to grasp from these technical definition files."
- Generated from a knowledge graph (Zalando's approach). Domain experts author the graph's Concept / Attribute / Relationship nodes and the column → concept mappings; a Python script then generates the logical data model from the graph. Each concept becomes a table (columns = its attributes + ID), and each relationship becomes a join table (patterns/mapping-driven-schema-generation).
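The generated approach can be sketched in a few lines of Python. This is a minimal illustration, not Zalando's script: the `Concept` and `Relationship` classes, the `generate_logical_model` function, and the customer/address example are all hypothetical, the graph is held in memory rather than in Neo4j, and the column → concept mappings are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    # Hypothetical stand-in for a Concept node and its Attribute nodes.
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    # Hypothetical stand-in for a Relationship between two concepts.
    name: str
    source: str  # name of the source concept
    target: str  # name of the target concept


def generate_logical_model(
    concepts: List[Concept], relationships: List[Relationship]
) -> Dict[str, List[str]]:
    """Return {table_name: [column, ...]} for the given graph."""
    tables: Dict[str, List[str]] = {}
    for concept in concepts:
        # One table per concept: an internal identifier plus its attributes.
        tables[concept.name] = ["id"] + list(concept.attributes)
    for rel in relationships:
        # One join table per relationship, holding both concept identifiers.
        tables[rel.name] = [f"{rel.source}_id", f"{rel.target}_id"]
    return tables


# Assumed example graph; not taken from the source.
concepts = [
    Concept("customer", ["name", "email"]),
    Concept("address", ["street", "city"]),
]
relationships = [Relationship("customer_has_address", "customer", "address")]

model = generate_logical_model(concepts, relationships)
for table, columns in model.items():
    print(table, columns)
```

Under these assumptions, adding a concept or a relationship to the graph adds exactly one table to the output, so domain experts can review the graph instead of the table definitions.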
Generated-model structure at Zalando¶
"Each concept is created with a table of its own, where the columns are all of its attributes and an internal identifier for the concepts. Each relationship also becomes a table of its own with the internal identifiers of the source and target concepts as foreign key columns." — Zalando MDM (sources/2021-07-28-zalando-knowledge-graph-technologies-accelerate-and-improve-the-data-model-definition)
This is a relational realisation of the graph — suitable for a traditional SQL-backed MDM store. It mirrors the graph structure rather than denormalising it.
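As a rough illustration of that relational realisation, the sketch below emits SQL DDL for two concept tables and one relationship table, then loads it into an in-memory SQLite database. The table names, the `concept_ddl` and `relationship_ddl` helpers, the `TEXT` column type, and the choice of SQLite are assumptions made for the example; the source does not specify the target database or column types.

```python
import sqlite3

# Hypothetical generated model: concept name -> attribute names.
concept_tables = {
    "customer": ["name", "email"],
    "address": ["street", "city"],
}
# Hypothetical relationships: name -> (source concept, target concept).
relationship_tables = {
    "customer_has_address": ("customer", "address"),
}


def concept_ddl(table, attributes):
    # Internal identifier plus one column per attribute (TEXT is assumed).
    cols = ["id INTEGER PRIMARY KEY"] + [f"{a} TEXT" for a in attributes]
    return f"CREATE TABLE {table} ({', '.join(cols)})"


def relationship_ddl(table, source, target):
    # Source and target identifiers as foreign key columns.
    return (
        f"CREATE TABLE {table} ("
        f"{source}_id INTEGER NOT NULL REFERENCES {source}(id), "
        f"{target}_id INTEGER NOT NULL REFERENCES {target}(id))"
    )


statements = [concept_ddl(t, attrs) for t, attrs in concept_tables.items()]
statements += [
    relationship_ddl(t, src, dst) for t, (src, dst) in relationship_tables.items()
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
for stmt in statements:
    conn.execute(stmt)

conn.execute(
    "INSERT INTO customer (id, name, email) VALUES (1, 'Ada', 'ada@example.com')"
)
conn.execute(
    "INSERT INTO address (id, street, city) VALUES (10, 'Main St 1', 'Berlin')"
)
conn.execute("INSERT INTO customer_has_address VALUES (1, 10)")

# A link to a concept row that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO customer_has_address VALUES (1, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("dangling link rejected:", rejected)
```

Keeping each relationship in its own table means a many-to-many link needs no change to the concept tables, which is one reason a graph-mirroring layout is convenient to generate.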
Seen in¶
- sources/2021-07-28-zalando-knowledge-graph-technologies-accelerate-and-improve-the-data-model-definition — Zalando generates the logical data model from a Neo4j knowledge graph via a Python script.
Related¶
- concepts/master-data-management — the enclosing discipline
- concepts/golden-record — what the logical data model is the schema of
- concepts/transformation-data-model — the sibling deliverable (per-system source → logical-model mapping)
- concepts/semantic-layer-of-business-concepts — the middle layer Zalando uses to decouple source and logical schemas
- concepts/knowledge-graph — the substrate Zalando generates the logical model from
- systems/zalando-mdm-system — canonical wiki instance
- patterns/mapping-driven-schema-generation — the pattern