PATTERN Cited by 1 source

In-workspace app as decision support

Pattern: deploy a decision-support web application inside the data-platform workspace boundary rather than as a separate web tier that synchronises to the warehouse. The app authenticates as a workspace service principal, queries the analytical store directly via the SQL Statement API, calls adjacent platform services (NL query, ML serving) over internal REST APIs, and uses the workspace's managed operational database for app-tier state.

Canonical wiki instance: sources/2026-05-13-databricks-clinical-operations-intelligence-belongs-on-the-lakehouse on systems/databricks-apps: "The app authenticates as a first-class workspace service principal, queries Unity Catalog via the SQL Statement API, and calls AI/BI Genie over the workspace REST API — all on internal connections. Clinical operations data never crosses a workspace boundary."
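
A minimal sketch of the analytical data path, assuming the app already holds an OAuth token for its service principal and knows the ID of a SQL warehouse. The helper name `build_statement_request`, the table name, and the host are hypothetical; the endpoint path and the `warehouse_id` / `statement` / `wait_timeout` / `parameters` body fields follow the public SQL Statement Execution API and should be checked against the workspace's API reference. The request is built but not sent.

```python
import json
import urllib.request


def build_statement_request(host, token, warehouse_id, statement, parameters=None):
    """Build (but do not send) a SQL Statement Execution API request.

    host, token and warehouse_id are supplied by the caller; how the app
    obtains the service-principal token is outside this sketch.
    """
    body = {
        "warehouse_id": warehouse_id,
        "statement": statement,
        "wait_timeout": "30s",  # wait synchronously for up to 30 seconds
    }
    if parameters:
        # Named parameters are referenced in the statement as :name.
        body["parameters"] = [
            {"name": name, "value": str(value)} for name, value in parameters.items()
        ]
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/sql/statements",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical usage: the catalog, schema and table names are illustrative only.
request = build_statement_request(
    host="https://example-workspace.test",
    token="<service-principal-token>",
    warehouse_id="<warehouse-id>",
    statement="SELECT site_id, score FROM clinical.ops.site_scores WHERE region = :region",
    parameters={"region": "EMEA"},
)
```

Passing the request to `urllib.request.urlopen` would execute it. Because the call runs as the app's service principal, the rows returned are whatever Unity Catalog grants that principal, with no separate credential to rotate.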

What the pattern eliminates

Compared to the conventional web-app-tier shape (separate Kubernetes cluster + JDBC connections to the warehouse + sync to a separate RDS + separate IAM + separate secrets manager), the in-workspace shape eliminates:

  • The analytical→operational sync pipeline. The app reads Unity Catalog (UC) directly; there is no separate operational-DB schema to keep synchronized.
  • The separately rotated credential surface. The app authenticates as the workspace service principal; UC and Lakebase access are inherited through the workspace identity system.
  • The analytical-vs-operational semantic-harmonization layer. UC's schema is the application's schema, so there is no Silver-layer reconciliation between the warehouse model and the application model.
  • The external API surface for cross-tier composition. "Calls AI/BI Genie over the workspace REST API — all on internal connections" — Genie is composed without creating a public endpoint.
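
The Genie composition in the last item can be sketched as a plain REST call made with the same service-principal token. This is an illustration under assumptions: `build_genie_question` and the space ID are hypothetical, and the `start-conversation` path and `content` field follow the public Genie Conversation API as commonly documented, so verify them against the workspace's API reference before relying on them. The request is built but not sent.

```python
import json
import urllib.request


def build_genie_question(host, token, space_id, question):
    """Build (but do not send) a request that starts a Genie conversation."""
    # Assumed endpoint shape; confirm against the workspace API reference.
    url = f"{host.rstrip('/')}/api/2.0/genie/spaces/{space_id}/start-conversation"
    return urllib.request.Request(
        url=url,
        data=json.dumps({"content": question}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical usage: host, token and space ID are placeholders.
request = build_genie_question(
    host="https://example-workspace.test",
    token="<service-principal-token>",
    space_id="<genie-space-id>",
    question="Which sites enrolled fastest last quarter?",
)
```

The target host is the workspace itself, which is the point of the pattern: no public endpoint or per-user OAuth flow is introduced to reach Genie.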

Implementation shape

| Layer | Conventional shape | In-workspace shape |
| --- | --- | --- |
| Web tier identity | Per-app IAM role + secrets manager | Workspace service principal |
| Analytical data path | JDBC connection pool with rotated credentials | SQL Statement API over internal HTTP |
| NL query path | Public Genie API + per-user OAuth | Workspace REST API over internal connections |
| Operational state | Separately managed RDS + sync pipeline | systems/lakebase (managed Postgres, scale-to-zero, workspace-credentialed) |
| RBAC | App-side role mapping translated from UC policies | Inherited from UC access controls automatically |
| Network posture | App-tier egress + NAT + private link to warehouse | No external network surface; all internal |
| Deployment | CI/CD pipeline → Kubernetes → service mesh | Deploy into the workspace; ~30 min for the canonical reference impl |

When the pattern applies

The 2026-05-13 source pitches this for regulated decision-support applications where audit-chain integrity matters more than multi-platform portability:

  • Clinical-trial site selection (systems/site-feasibility-workbench, the reference implementation), under FDORA 2022 + 21 CFR Part 11 + ICH E6(R3) + FDA GMLP.
  • Patient cohort and recruitment (named as a roadmap module of the broader Clinical Operations Intelligence Hub).
  • Enrollment velocity optimization with ML stall prediction.
  • Risk-based monitoring and compliance with continuous anomaly detection.

The general criteria for "this is the right shape":

  • The app's data is already in a governed analytical substrate. If the data is elsewhere, the pattern degrades to ingest-pipeline-plus-app-tier.
  • The audit chain matters more than platform-portability. The pattern trades coupling-to-the-platform for chain-of-custody from decision back to training data, in one governance system.
  • The app is read-mostly with modest operational state. Lakebase scale-to-zero suits bursty app-tier state (saved shortlists, user preferences, session data) but is not ideal for high-throughput OLTP.
  • PHI / PII / regulatory data handling is configured at the catalog level. "PHI handling follows the sponsor's HIPAA Safe Harbor / Expert Determination posture configured at the catalog or schema level" — the app inherits this for free.

When the pattern doesn't fit

  • Multi-region / multi-cloud apps that must remain platform-portable. The in-workspace shape couples the app to the data-platform workspace boundary by design.
  • High-throughput consumer-facing OLTP. Lakebase suits app state in bursty decision-support workloads; sustained OLTP at thousands of transactions per second or more belongs on a dedicated OLTP substrate.
  • Apps that primarily serve external (non-workspace) users. The workspace-resident model assumes the app's user identities map onto workspace-side identity infrastructure.
  • Apps where the data is not in the data platform. If the data lives in transactional source-of-truth systems (CTMS / EDC / IRT in the clinical-trials case), the in-workspace app is a decision-support layer consuming Lakehouse-side analytical copies — "This is a decision-support layer, not a source-of-record system. The CTMS/EDC/IRT remain authoritative."
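
The fit and non-fit criteria above can be read as a simple checklist. The sketch below encodes them as one; `AppProfile`, `pattern_blockers`, and the 1,000 writes-per-second ceiling are all hypothetical names and values chosen for illustration, not thresholds stated by the source.

```python
from dataclasses import dataclass

# Assumption: "high-throughput OLTP" is read here as roughly 1,000 sustained
# writes per second. The source gives no exact figure.
OLTP_WRITE_CEILING = 1_000.0


@dataclass
class AppProfile:
    data_in_governed_catalog: bool        # data already lives in the governed analytical store
    must_stay_platform_portable: bool     # multi-cloud / multi-region portability is required
    sustained_writes_per_second: float    # expected steady-state write rate for app state
    users_have_workspace_identities: bool # users map onto workspace identity infrastructure


def pattern_blockers(profile):
    """Return the reasons the in-workspace pattern is a poor fit (empty if none)."""
    blockers = []
    if not profile.data_in_governed_catalog:
        blockers.append("data is outside the platform: needs an ingest pipeline first")
    if profile.must_stay_platform_portable:
        blockers.append("portability requirement conflicts with workspace coupling")
    if profile.sustained_writes_per_second >= OLTP_WRITE_CEILING:
        blockers.append("write rate calls for a dedicated OLTP substrate")
    if not profile.users_have_workspace_identities:
        blockers.append("users do not map onto workspace identities")
    return blockers


# Hypothetical profile resembling an internal, read-mostly decision-support app.
feasibility_app = AppProfile(
    data_in_governed_catalog=True,
    must_stay_platform_portable=False,
    sustained_writes_per_second=5.0,
    users_have_workspace_identities=True,
)
print(pattern_blockers(feasibility_app))
```

An empty list only means none of the listed disqualifiers applies; whether audit-chain integrity outweighs platform coupling remains a judgment call for the team.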

Composes with

Trade-offs

| Axis | Cost | Benefit |
| --- | --- | --- |
| Platform coupling | App is now coupled to the data platform's runtime + identity + REST API surface. | UC governance, lineage, audit, and access controls compose for free. |
| Portability | Re-platforming the app means re-platforming the entire stack. | The single-platform composition eliminates four integration layers (sync pipeline + credential surface + RBAC translation + semantic harmonisation). |
| Runtime expressiveness | App is constrained to whatever the platform's app runtime supports (FastAPI / React in the canonical reference). | Deployment time drops to "approximately 30 minutes of technical deployment time" for the reference implementation. |
| Audit chain | Audit chain is constrained to artifacts the platform governs. | Audit chain is unbroken end-to-end inside one governance system. |

Operational disclosure (so far)

From the canonical source:

  • Deployment time: "approximately 30 minutes of technical deployment time, before sponsor-specific security review and validation" for the Site Feasibility Workbench.
  • Stack: FastAPI backend, React frontend, deploys "into an existing Databricks workspace with Unity Catalog."
  • Network posture: "makes no external API calls, maintains no separate operational database infrastructure, and requires no synchronization pipeline between the analytical and operational layers."

Latency / throughput / cold-start / scaling envelopes are not yet disclosed in the wiki sources for this pattern. Reserved for future ingest passes.

Seen in
