Moonlink
Moonlink is the name given to a "real-time synchronization engine between operational and analytical formats, with zero ETL" in the 2026-05-27 Databricks + Health Samurai co-marketing post on building a FHIR-native health data platform on Lakebase. Verbatim: "Data is replicated through Moonlink, a real-time synchronization engine between operational and analytical formats, with zero ETL. This allows FHIR data to flow seamlessly into the analytical layer, eliminating the dependencies for pipelines, transformation, or delays."
This entry records a first, name-only disclosure. The post names the primitive but does not disclose:
- Replication direction (one-way operational → analytical, bidirectional, or other).
- Replication mechanism (logical decoding, CDC over WAL, snapshot-and-tail, copy-on-write, other).
- Consistency model (strict, eventual, monotonic, snapshot isolation).
- Lag bounds (latency budget, freshness SLO, recovery semantics).
- Conflict-resolution semantics if bidirectional.
- Schema-evolution handling.
- Failure modes / partition-tolerance / split-brain behaviour.
- Relation to existing primitives — Lakebase Synced Tables (Delta → Postgres, three sync modes) and Lakehouse Sync (Postgres → Delta CDC) — so it is unclear whether Moonlink is a renaming, a unification, a third primitive, or a deeper engine underlying them.
Architectural role (per 2026-05-27)
In the canonicalised FHIR-server-on-lakehouse-substrate pattern (Aidbox-on-Lakebase), Moonlink owns the bridge between the operational format (Lakebase Postgres / Aidbox FHIR resources) and the analytical format (presumably Delta Lake under Unity Catalog) — without an explicit ETL pipeline.
The structural payoff named in the post:
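- No replica. The same dataset is reachable through both access surfaces (FHIR API for operational; Spark / SQL / ML / AI/BI for analytical), so Moonlink is presented as avoiding a second authoritative copy of clinical data. The post's own wording says the data "is replicated through Moonlink," and how the two statements reconcile is not disclosed.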
- No ETL. Eliminates "the dependencies for pipelines, transformation, or delays" — the pipeline that conventionally connects a FHIR server to a warehouse simply doesn't exist.
- Real-time. Both access patterns see freshly-written data without batch-cycle delays.
Together, these three properties make the dual-access pattern viable as a substrate property rather than as a per-customer integration project.
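The dual-access contract described above can be illustrated with a toy model. This is a minimal sketch of the reader-visible behaviour only (a write through the operational surface is immediately countable through the analytical surface, with no pipeline step in between). It is not Moonlink's mechanism, which the post does not disclose. `ToySharedStore`, `fhir_create`, `fhir_read`, and `count_by_type` are hypothetical names invented for illustration and do not correspond to any Aidbox, Lakebase, or Databricks API.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToySharedStore:
    """Hypothetical in-memory stand-in for one dataset with two access surfaces.

    Assumption: a single shared store. The real engine's design is undisclosed.
    """

    _rows: dict = field(default_factory=dict)

    # Operational surface: FHIR-style create and read-by-id.
    def fhir_create(self, resource: dict) -> str:
        key = f"{resource['resourceType']}/{resource['id']}"
        self._rows[key] = json.dumps(resource)
        return key

    def fhir_read(self, key: str) -> dict:
        return json.loads(self._rows[key])

    # Analytical surface: scan and aggregate over the same rows.
    def count_by_type(self) -> dict:
        counts: dict = {}
        for key in self._rows:
            resource_type = key.split("/", 1)[0]
            counts[resource_type] = counts.get(resource_type, 0) + 1
        return counts


store = ToySharedStore()
store.fhir_create({"resourceType": "Patient", "id": "p1"})
store.fhir_create({"resourceType": "Observation", "id": "o1", "subject": "Patient/p1"})
store.fhir_create({"resourceType": "Observation", "id": "o2", "subject": "Patient/p1"})

# No export or load step runs between the writes above and this aggregate.
print(store.count_by_type())  # {'Patient': 1, 'Observation': 2}
```

A real deployment would sit behind two different storage formats, which is exactly where the undisclosed questions (direction, consistency, lag) start to matter. The toy sidesteps them by construction.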
Seen in
- 2026-05-27 — sources/2026-05-27-databricks-building-a-fhir-native-health-data-platform-on-databricks-lakebase — first wiki naming; capability altitude only; positioned as the bridge between Aidbox's operational FHIR data on Lakebase and the Databricks analytical surface (Spark / SQL / ML / AI/BI).
Caveats
- Name-only disclosure. No mechanism, no internals, no scale, no positioning vs Synced Tables / Lakehouse Sync.
- Single source. No second-source confirmation, and no customer has disclosed the scale of a Moonlink-mediated production deployment.
- "Zero ETL" phrasing is a positioning claim. Real-world replication engines have failure modes, schema-evolution rules, and lag — none of these are disclosed. The "zero ETL" framing should be read as "no customer-managed ETL pipeline" rather than "no replication mechanism."
- Possible naming overlap. Without disclosed mechanism, it is unclear whether Moonlink is the brand name for a unification of Synced Tables + Lakehouse Sync, a deeper engine underlying both, or a parallel third primitive — to be clarified by future Databricks posts.
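Because no lag bounds are disclosed, anyone evaluating the "real-time" claim would have to measure freshness empirically. The sketch below shows one generic way to do that: write a marker on the operational side, then poll the analytical side until the marker appears. It makes no assumption about Moonlink itself. `measure_visibility_lag` and `FakeDelayedReplica` are hypothetical helpers, and the two callables stand in for whatever write and read paths a given deployment actually exposes.

```python
import time
from typing import Callable, Optional


def measure_visibility_lag(
    write_marker: Callable[[str], None],
    marker_visible: Callable[[str], bool],
    marker_id: str,
    timeout_s: float = 30.0,
    poll_interval_s: float = 0.05,
) -> Optional[float]:
    """Return seconds from write until the marker is visible, or None on timeout."""
    start = time.monotonic()
    write_marker(marker_id)
    while time.monotonic() - start < timeout_s:
        if marker_visible(marker_id):
            return time.monotonic() - start
        time.sleep(poll_interval_s)
    return None


class FakeDelayedReplica:
    """Simulated sink that exposes a marker only after a fixed delay (illustrative)."""

    def __init__(self, delay_s: float) -> None:
        self.delay_s = delay_s
        self._written_at: dict = {}

    def write(self, marker_id: str) -> None:
        self._written_at[marker_id] = time.monotonic()

    def visible(self, marker_id: str) -> bool:
        written = self._written_at.get(marker_id)
        return written is not None and time.monotonic() - written >= self.delay_s


# Demo against the simulated sink; a real probe would pass functions that
# write to the operational store and query the analytical store.
fake = FakeDelayedReplica(delay_s=0.2)
lag = measure_visibility_lag(fake.write, fake.visible, "probe-1", timeout_s=2.0)
print(f"observed lag: {lag:.2f}s")
```

The measured value is an upper bound on visibility lag at polling resolution, so a smaller `poll_interval_s` gives a tighter estimate at the cost of more queries. A single probe says nothing about tail latency or recovery behaviour, which would need repeated runs under load.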