Skip to content

CONCEPT Cited by 1 source

LTAP (Lake Transactional/Analytical Processing)

Definition

LTAP (Lake Transactional/Analytical Processing) is Databricks' architecture for running transactions and analytics against a single copy of data stored in open columnar formats on object storage. Unlike HTAP (which attempts a single engine for both workloads), LTAP keeps specialized engines for each workload type and unifies them at the storage layer.

How it works

  1. The Lakebase PageServer transcodes Postgres row-format data into Parquet columnar layout as it materializes pages to cloud object storage.
  2. Data lands in Delta Lake or Apache Iceberg table format, queryable by any Lakehouse engine.
  3. Postgres type system is preserved losslessly — exotic types use a structured overflow field.
  4. MVCC row versions are retained for Postgres snapshot isolation but are invisible to Iceberg/Delta readers (they see clean table snapshots).
  5. Analytical queries obtain a consistent read by asking Postgres for the current LSN, reading materialized data from object store, and merging only the recent delta from PageServer.

Key properties

  • No data movement: There is no CDC pipeline, no mirroring, no separate replication system.
  • Zero opt-in: Every table is available for analytics by construction.
  • Performance isolation: Transactional and analytical engines run on separate compute, sharing only storage.
  • Open formats: Data stored in Parquet with Delta/Iceberg metadata — no proprietary lock-in.
  • Columnar compression: >10× compression vs row format, reducing storage costs and network transfer.

Contrast with HTAP

Dimension HTAP LTAP
Engine Single engine, both workloads Separate specialized engines
Feature set Often incomplete Full (Postgres + Spark/Lakehouse)
Ecosystem New, limited Mature on both sides
Performance isolation Shared CPU/memory (contention) Fully isolated compute
Data copies One (in-engine) One (in storage layer)

Seen in

Last updated · 564 distilled / 1,671 read