CONCEPT
Data model mismatch (OLTP vs OLAP)¶
Data model mismatch names the structural observation that OLTP and OLAP workloads want different shapes of the same data, and that trying to serve both from a single physical schema forces "suboptimal design compromises" on both (Source: sources/2026-04-21-planetscale-what-is-htap).
OLTP wants:
- Normalised schemas for update integrity — each fact stored once; referential constraints enforced.
- Row-oriented storage for fast single-record reads/writes.
- Fine-grained indexes on predicates used by high-QPS point queries.
- Small per-operation footprint — a transaction touches a handful of rows.
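The four OLTP properties above can be sketched together in a tiny SQLite schema. Table and column names (`customers`, `orders`, `idx_orders_customer`) are hypothetical, for illustration only:

```python
import sqlite3

# Hypothetical normalised OLTP schema: each fact stored once,
# a foreign key enforcing referential integrity, and a narrow
# index serving a high-QPS point predicate.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount_cents INTEGER NOT NULL)""")
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")

conn.execute("INSERT INTO customers VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO orders VALUES (10, 1, 4999)")

# Small per-operation footprint: one point read by indexed predicate.
row = conn.execute(
    "SELECT amount_cents FROM orders WHERE customer_id = ?", (1,)).fetchone()
```

SQLite stores rows contiguously, so the point read touches one record and one index entry, which is exactly the access pattern the list above describes.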
OLAP wants:
- Denormalised / star-schema / wide flat tables for scan-and-aggregate without JOIN fan-out.
- Columnar storage for selective-I/O + per-column compression + vectorised execution.
- Sort-orders and zone-maps / bloom filters for pruning whole row groups without touching payload bytes.
- Large per-operation footprint — a single query scans millions to billions of rows.
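The row-vs-column contrast behind the first two OLAP points can be sketched in a few lines. Illustrative Python lists stand in for disk pages; this is not a real engine:

```python
# The same orders stored row-major vs column-major.
rows = [
    (1, 7, 4999),  # (id, customer_id, amount_cents)
    (2, 8, 1250),
    (3, 7, 320),
]

# Row layout: an aggregate must walk every tuple, dragging along
# the columns it never uses.
total_row = sum(r[2] for r in rows)

# Column layout: one contiguous array per attribute; the scan
# touches only amount_cents, and each array compresses per-column.
col_amount_cents = [4999, 1250, 320]
total_col = sum(col_amount_cents)

assert total_row == total_col  # same answer, very different I/O
```

At three rows the difference is invisible; at the millions-to-billions scale named above, skipping unused columns is the whole game.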
The mismatch is structural: each optimisation subtracts from the other. A schema tuned for one workload is actively hostile to the other.
Canonical verbatim framing¶
From sources/2026-04-21-planetscale-what-is-htap:
OLTP and OLAP workloads typically involve different data models. OLTP transactions focus on updating individual records and maintaining data integrity and consistency, while OLAP operations involve complex aggregations and scans. Trying to fit both types of workloads into the same data model can lead to suboptimal design compromises that hinder efficient processing for either workload.
This is challenge (3) of the eight structural HTAP challenges.
Where the mismatch bites¶
- Storage layout. Row vs column — concepts/row-vs-column-major-layout. OLAP-friendly columnar layout is hostile to OLTP single-row reads (a point read must touch every column file); OLTP-friendly row layout is hostile to OLAP scans (a scan reads columns it never uses).
- Normalisation level. OLTP 3NF schemas require JOINs that are fine for small lookups but devastating on billion-row scans; denormalised OLAP schemas have update anomalies that OLTP cannot tolerate.
- Index design. OLTP benefits from many narrow indexes on hot predicates; OLAP benefits from zone-maps / bloom filters / min-max metadata at the row-group level. Running many OLTP indexes on an OLAP warehouse inflates ingest cost; running zone-maps on an OLTP engine consumes metadata space without accelerating point reads.
- Update vs append. OLTP is update-in-place (via MVCC, as in InnoDB, or by overwriting); OLAP is append + compact (LSM-ish, or re-partition-and-rewrite for columnar).
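The zone-map pruning mentioned under index design can be sketched as follows. The min/max-per-row-group structure is an assumption for illustration, not any specific engine's format:

```python
# Each row group carries min/max for a sort-ordered column, so a
# range predicate can skip whole groups without touching payload bytes.
row_groups = [
    {"min": 0,   "max": 99,  "values": list(range(0, 100))},
    {"min": 100, "max": 199, "values": list(range(100, 200))},
    {"min": 200, "max": 299, "values": list(range(200, 300))},
]

def scan_ge(groups, threshold):
    """Return (groups actually scanned, matching values) for v >= threshold."""
    scanned, hits = 0, []
    for g in groups:
        if g["max"] < threshold:  # zone map excludes the whole group
            continue
        scanned += 1
        hits.extend(v for v in g["values"] if v >= threshold)
    return scanned, hits

scanned, hits = scan_ge(row_groups, 250)
# Only the last group is scanned; the first two are pruned by metadata alone.
```

This is why sort order matters for OLAP: pruning only works when the zone maps are selective, i.e. when values are clustered per group. The same metadata does nothing for an OLTP point read, which the index-design bullet above calls out.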
Canonical fix: two schemas, one pipeline¶
The PlanetScale-recommended architecture avoids the mismatch by keeping two distinct physical schemas: an OLTP schema in the source database (normalised, row-oriented, indexed) and an OLAP schema in the warehouse (denormalised, columnar, partitioned), connected by ETL/ELT or CDC that translates between them — see patterns/warehouse-unload-bridge.
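A minimal sketch of the translation step in that pipeline, assuming hypothetical `customers` / `orders` tables and illustrative field names (the source does not prescribe a specific transform):

```python
# Batch ETL step: join the normalised OLTP tables once, emitting the
# denormalised wide-row shape the warehouse scans without JOIN fan-out.
customers = {1: {"email": "a@example.com", "region": "EU"}}
orders = [
    {"id": 10, "customer_id": 1, "amount_cents": 4999},
    {"id": 11, "customer_id": 1, "amount_cents": 1250},
]

def to_warehouse_rows(orders, customers):
    """Denormalise: copy customer attributes onto each order fact."""
    for o in orders:
        c = customers[o["customer_id"]]
        yield {
            "order_id": o["id"],
            "amount_cents": o["amount_cents"],
            "customer_email": c["email"],
            "customer_region": c["region"],
        }

wide = list(to_warehouse_rows(orders, customers))
```

The update anomaly the OLTP side cannot tolerate (customer attributes duplicated per order) is harmless here because the warehouse copy is rebuilt, or incrementally patched via CDC, from the source of truth.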
Canva's MySQL → Snowflake + dbt migration (sources/2024-04-29-canva-scaling-to-count-billions) is a concrete example: the billion-row counting workload was a data-model mismatch disguised as a scalability problem; moving to a warehouse-native aggregation schema was the actual fix.
Seen in¶
- sources/2026-04-21-planetscale-what-is-htap — canonical wiki framing as challenge #3 of the eight structural HTAP challenges. "Trying to fit both types of workloads into the same data model can lead to suboptimal design compromises that hinder efficient processing for either workload."