Lakebase Synced Tables
Lakebase Synced Tables are managed copies of systems/unity-catalog Delta tables materialised inside systems/lakebase for OLTP-style, low-latency point lookup access. They are the Delta → Postgres half of Lakebase's bidirectional governed-data path; the other half is Lakehouse Sync (Postgres → Delta).
Verbatim from the 2026-05-20 marketing-campaigns post:
"Databricks Synced Tables create a managed copy of our Unity Catalog data in Lakebase, making it available to applications that need OLTP-style, low-latency queries."
The synced table appears to the application as a normal Postgres table. The synchronisation pipeline is managed — the customer doesn't write or maintain sync code. Configuration is "just a few clicks" in the Lakebase UI.
Sync modes
Three modes are exposed. Mode selection is driven by the fraction of the upstream Delta table that changes per sync cycle, rather than by cadence:
| Mode | Cadence | Semantics | When to use |
|---|---|---|---|
| Snapshot | On-demand or scheduled | Replaces the entire Lakebase table from a Delta snapshot | When >10% of upstream data changes per cycle |
| Triggered | On-demand | Incremental upsert | When <10% of upstream data changes per cycle |
| Continuous | Streaming | Continuous incremental upsert | When latency is critical and changes are small/frequent |
The 10% / 10× rule of thumb
The load-bearing operational disclosure from the 2026-05-20 post:
"When more than 10% of the data is updated, we recommend snapshot mode, which delivers 10x better performance than triggered mode."
This is canonicalised as the patterns/snapshot-sync-mode-for-batch-rebuild pattern. The counterintuitive part: the snapshot replaces the entire table on every cycle, but for high-delta workloads it's still 10× faster than incremental upsert because:
- The incremental path pays a per-row diff/merge cost that scales linearly with delta size.
- The snapshot path is a bulk copy, which Lakebase's storage-compute-separation backend can stream efficiently from Pageserver without per-row conflict resolution.
When >10% of rows change, the bulk snapshot wins. When <10% change, the incremental path touches only a small share of the table, so its per-row cost stays below the cost of re-copying everything and triggered mode wins.
The post does not quantify the continuous-mode tradeoff or the exact crossover behaviour around the 10% threshold.
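The rule of thumb above can be written down as a small decision helper. This is a minimal sketch, not a Lakebase API: `SyncMode`, `choose_sync_mode`, and the `latency_critical` flag are hypothetical names introduced for illustration. The post does not say what to do at exactly 10%, so treating that boundary as triggered mode is an assumption.

```python
from enum import Enum


class SyncMode(Enum):
    SNAPSHOT = "snapshot"
    TRIGGERED = "triggered"
    CONTINUOUS = "continuous"


def choose_sync_mode(changed_rows: int, total_rows: int,
                     latency_critical: bool = False) -> SyncMode:
    """Pick a sync mode from the share of rows changed per sync cycle.

    Hypothetical helper: it encodes the 10% rule of thumb from the post,
    plus an assumed tie-break at exactly 10% (treated as triggered).
    """
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    if not 0 <= changed_rows <= total_rows:
        raise ValueError("changed_rows must be between 0 and total_rows")

    # Integer comparison for "more than 10% changed" avoids float rounding.
    if changed_rows * 10 > total_rows:
        return SyncMode.SNAPSHOT
    # Small deltas: continuous when latency matters, triggered otherwise.
    if latency_critical:
        return SyncMode.CONTINUOUS
    return SyncMode.TRIGGERED
```

For a nightly rebuild that rewrites 400,000 of 1,000,000 rows, `choose_sync_mode(400_000, 1_000_000)` returns `SyncMode.SNAPSHOT`; a cycle that touches 5,000 rows returns `SyncMode.TRIGGERED`, or `SyncMode.CONTINUOUS` when `latency_critical=True`.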
Canonical workload: marketing-campaign customer segments
The 2026-05-20 post pitches the canonical use case explicitly:
"Customer segments are recomputed nightly in batch, replacing a significant portion of the dataset. When more than 10% of the data is updated, we recommend snapshot mode."
The shape:
- Analytical pipeline in the Lakehouse computes customer segments nightly (large-scan, complex SQL, lakehouse-native work).
- Synced Table in snapshot mode materialises the segment table into Lakebase Postgres.
- Marketing platform (e.g. SAP Engagement Cloud) queries Lakebase as a normal Postgres database — point lookups by campaign trigger.
- Compute scales 0 → 16 CU during campaign bursts, back to 0 during the lows.
This is a clean separation of analytical (segment computation) and operational (segment lookup) concerns where the Synced Table is the boundary artifact. Without it, the customer would either:
- Run point lookups on the Lakehouse (slow, expensive, not optimised for high-concurrency point reads), or
- Build and maintain their own Lakehouse → OLTP sync pipelines per segment table (operational burden the post explicitly cites as the problem this avoids).
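The operational half of this shape, a point lookup keyed by customer, might look like the sketch below. The table name `customer_segments`, its columns, and the helper `lookup_segment` are hypothetical; the post does not disclose the schema. To keep the example runnable without a Lakebase instance, it uses Python's built-in `sqlite3` as a local stand-in. Against Lakebase the same query would go through a Postgres driver, and with psycopg the placeholder would be `%s` rather than `?`.

```python
import sqlite3


def lookup_segment(conn, customer_id):
    """Return the segment for one customer, or None if the row is absent.

    A single-row primary-key read: the OLTP-style access pattern that the
    synced copy is meant to serve.
    """
    cur = conn.execute(
        "SELECT segment FROM customer_segments WHERE customer_id = ?",
        (customer_id,),
    )
    row = cur.fetchone()
    return row[0] if row else None


# Local stand-in for the synced table (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_segments ("
    "customer_id INTEGER PRIMARY KEY, segment TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customer_segments VALUES (?, ?)",
    [(1, "loyal"), (2, "lapsed")],
)
```

The application only reads here. The rows in this sketch are inserted by hand purely to make it self-contained; in the described setup the sync pipeline populates the table.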
Architectural relationship to Lakebase
Synced Tables are read-only from the application's perspective. The Lakebase compute can read them via standard Postgres queries (with indexes, query plans, the works) but writes go through the Synced Tables sync layer, not through direct table writes. This is consistent with the concepts/htap separation of concerns: analytical workloads own the upstream Delta table; operational workloads read from the synced copy.
Bidirectional companion: Lakehouse Sync handles the other direction — operational data written into Lakebase Postgres (e.g. notification signups from applications) is continuously synchronised back to Unity Catalog Delta tables for analytics.
| Direction | Mechanism | Use case |
|---|---|---|
| Delta → Postgres | Synced Tables (3 modes) | Customer segments, lookup tables, AI features |
| Postgres → Delta | Lakehouse Sync (continuous CDC) | App-tier operational data (signups, events, state) |
Both pipelines are managed; both are governed by Unity Catalog; both eliminate the hand-written-sync-pipeline operational tax.
Constraints (disclosed)
- OLTP-shape only. "Databricks Lakebase is optimized for high-concurrency point lookups and short OLTP queries, not for large scans or classic OLAP." Synced Tables don't change this — large scans against synced tables should still happen on the upstream Delta table, not on the Lakebase copy.
- Not real-time for snapshot mode. Snapshot mode is bulk refresh; the freshness floor is the snapshot cadence (e.g. nightly).
Seen in
- Marketing campaigns at Deichmann (2026-05-20) — nightly segment recompute → snapshot-mode Synced Table → SAP Engagement Cloud campaign triggers. (Source: sources/2026-05-20-databricks-marketing-campaigns-with-lakebase)
Related
- systems/lakebase — host system; Synced Tables run inside Lakebase compute against Postgres-tier storage.
- systems/unity-catalog — upstream governance / source of Delta tables.
- systems/delta-lake — upstream storage format.
- systems/lakehouse-sync — bidirectional companion (Postgres → Delta).
- patterns/snapshot-sync-mode-for-batch-rebuild — the >10% rule of thumb.
- concepts/change-data-capture — generalisation; Lakehouse Sync is the CDC instance, Synced Tables snapshot mode is the bulk-rebuild alternative for high-delta cases.
- concepts/htap — Synced Tables are the analytical-to-operational materialisation layer.