CONCEPT Cited by 1 source
Master data management¶
Definition¶
Master Data Management (MDM) is "a technology-enabled discipline in which business and Information Technology work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise's official shared master data assets" (sources/2021-07-28-zalando-knowledge-graph-technologies-accelerate-and-improve-the-data-model-definition).
In practice, MDM addresses the problem where an enterprise has no central view of a specific subject matter (business partners, customers, products) because the relevant records are scattered across many systems, each holding its own copy that may or may not agree with the others. MDM introduces a [[concepts/golden-record|golden record]]: a single, shared, trusted view per domain.
Implementation styles¶
The industry commonly distinguishes several MDM implementation styles, three of which are listed below. Zalando names only the one it chose:
- Consolidated style — ingest from source systems → run through match-and-merge → cleanse / quality-assure → store centrally in a canonical model → publish golden record back to source systems for correction. "At Zalando we are at an early phase of realising MDM for our internal data assets and we have chosen to do it in a consolidated style."
- Registry style (unnamed in Zalando's post) — a central MDM system stores only identifiers and pointers; records remain in source systems.
- Coexistence style (unnamed in Zalando's post) — golden record is stored centrally and written back into source systems as authoritative.
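The consolidated flow described above (ingest, match-and-merge, cleanse, store centrally) can be sketched in a few lines of Python. This is a minimal illustration, not Zalando's implementation: the record fields, the choice of a VAT id as match key, and the "first name seen wins" survivorship rule are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records from two source systems; field names are illustrative.
crm_records = [
    {"source": "crm", "id": "c-1", "name": "ACME  GmbH ", "vat_id": "DE123"},
    {"source": "crm", "id": "c-2", "name": "Globex", "vat_id": "DE999"},
]
erp_records = [
    {"source": "erp", "id": "e-7", "name": "Acme GmbH", "vat_id": " de123"},
]


def match_key(record):
    """Assumed match rule: records with the same normalised VAT id are the same entity."""
    return record["vat_id"].strip().upper()


def match_and_merge(records):
    """Group records by match key and merge each group into one candidate golden record."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [
        {
            "vat_id": key,
            # Assumed survivorship rule: keep the name of the first record seen.
            "name": members[0]["name"],
            # Keep track of which source records contributed.
            "sources": [(m["source"], m["id"]) for m in members],
        }
        for key, members in groups.items()
    ]


def cleanse(golden):
    """Collapse repeated whitespace in the name (stand-in for real quality rules)."""
    return {**golden, "name": " ".join(golden["name"].split())}


def consolidate(*source_batches):
    """Ingest -> match-and-merge -> cleanse -> central store (a plain list here)."""
    ingested = [record for batch in source_batches for record in batch]
    return [cleanse(golden) for golden in match_and_merge(ingested)]


golden_records = consolidate(crm_records, erp_records)
```

Here the two `DE123` records collapse into one golden record that remembers both contributing source ids, which is what would allow a correction to be published back to the source systems.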
Core MDM deliverables¶
Any consolidated-style MDM project produces at least two schemas:
- Logical data model — the schema of the golden record: which entities exist, their attributes, their relationships.
- Transformation data model — for each source system, how each of its tables and columns maps (directly or indirectly) onto the logical model.
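A transformation mapping is direct when a source column is copied 1-to-1 into a logical attribute. It is indirect when one source column feeds several logical attributes (1-to-many) and therefore needs a transformation algorithm, e.g. parsing unstructured address lines into structured components.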
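The difference between the two mapping kinds can be shown with a small sketch. The source column names (`cust_name`, `addr_line`), the target attribute names, and the address format handled by the regular expression are all hypothetical; a real address parser would need to cope with far more variation.

```python
import re

# Hypothetical source row; column names are illustrative only.
source_row = {"cust_name": "Acme GmbH", "addr_line": "Musterstrasse 12, 10115 Berlin"}

# Direct mappings: source column -> logical attribute, value copied unchanged.
DIRECT = {"cust_name": "name"}

# Assumed address layout: "<street> <house number>, <5-digit postal code> <city>".
ADDRESS_PATTERN = re.compile(
    r"^(?P<street>.+?)\s+(?P<house_number>\d+\w*),\s*"
    r"(?P<postal_code>\d{5})\s+(?P<city>.+)$"
)


def parse_address(line):
    """Indirect mapping: split one unstructured column into several attributes."""
    match = ADDRESS_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised address format: {line!r}")
    return match.groupdict()


# Indirect mappings: source column -> function returning several logical attributes.
INDIRECT = {"addr_line": parse_address}


def to_logical(row):
    """Apply all direct copies, then all indirect transformations."""
    record = {target: row[source] for source, target in DIRECT.items()}
    for source, transform in INDIRECT.items():
        record.update(transform(row[source]))
    return record


logical = to_logical(source_row)
```

Keeping direct and indirect mappings in separate tables makes it easy to see how much of a source system needs real transformation logic rather than a plain copy.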
The manual-definition problem¶
Zalando names five drawbacks of the traditional MDM workflow where the logical data model is authored by hand (Source: sources/2021-07-28-zalando-knowledge-graph-technologies-accelerate-and-improve-the-data-model-definition):
- Linear-in-table-count manual work — "the amount of manual work to create the logical data model increases relatively to the number of system tables."
- Domain knowledge is in the wrong place — "the data models are read and created by colleagues from engineering with limited business know-how."
- Communication artifacts are unreadable for business — SQL schemas and spreadsheets gatekeep understanding for non-technical domain experts.
- Business-engineering handoff is lossy — "the domain expert is limited from conveying correctly the knowledge to the engineers creating the data model, which leads to errors and misunderstandings."
- Risk amplification — the logical data model drives UI, processes, business rules, and storage; errors found late in development are expensive to unwind. "A MDM tool is released with a faulty and incorrect model that needs iterations of rework."
Zalando's response is to use a knowledge graph as the authoring substrate and auto-generate both schemas from it (patterns/knowledge-graph-for-mdm-modeling).
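To make the idea of generating both schemas from one graph concrete, here is a toy sketch that stores the graph as (subject, predicate, object) triples. It is not Zalando's tooling: the predicate names (`has_attribute`, `related_to`, `maps_to`), the entities, and the source column paths are invented for illustration.

```python
from collections import defaultdict

# Hypothetical knowledge graph; every name below is made up for the example.
TRIPLES = [
    ("BusinessPartner", "has_attribute", "name"),
    ("BusinessPartner", "has_attribute", "vat_id"),
    ("Address", "has_attribute", "city"),
    ("BusinessPartner", "related_to", "Address"),
    ("crm.customers.cust_name", "maps_to", "BusinessPartner.name"),
    ("erp.partners.tax_no", "maps_to", "BusinessPartner.vat_id"),
    ("crm.customers.town", "maps_to", "Address.city"),
]


def logical_model(triples):
    """Derive entities, their attributes, and their relationships."""
    model = defaultdict(lambda: {"attributes": [], "relations": []})
    for subject, predicate, obj in triples:
        if predicate == "has_attribute":
            model[subject]["attributes"].append(obj)
        elif predicate == "related_to":
            model[subject]["relations"].append(obj)
    return dict(model)


def transformation_model(triples):
    """Derive, per source system, which table column maps to which logical attribute."""
    mappings = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == "maps_to":
            system, table, column = subject.split(".")
            mappings[system].append({"table": table, "column": column, "target": obj})
    return dict(mappings)


def lineage(triples, target):
    """List the source columns that feed a given logical attribute."""
    return [s for s, p, o in triples if p == "maps_to" and o == target]


logical = logical_model(TRIPLES)
transformation = transformation_model(TRIPLES)
name_lineage = lineage(TRIPLES, "BusinessPartner.name")
```

Because the `maps_to` edges live in the same graph as the entity definitions, a lineage query such as `lineage(TRIPLES, "BusinessPartner.name")` needs no extra bookkeeping.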
Seen in¶
- sources/2021-07-28-zalando-knowledge-graph-technologies-accelerate-and-improve-the-data-model-definition — Zalando's consolidated-style MDM design, early-phase; uses a knowledge graph to accelerate the data-model-definition phase.
Related¶
- concepts/golden-record — the MDM deliverable
- concepts/logical-data-model — the schema of the golden record
- concepts/transformation-data-model — the mapping from source schemas
- concepts/semantic-layer-of-business-concepts — the middle layer that decouples source schemas from the golden-record schema
- concepts/knowledge-graph — the modeling substrate Zalando chose
- concepts/data-lineage — falls out of the graph approach for free
- systems/zalando-mdm-system — the canonical wiki instance
- patterns/knowledge-graph-for-mdm-modeling — the design pattern