SYSTEM Cited by 4 sources
Databricks Apps
Databricks Apps is Databricks' deployment model for running web applications inside the workspace boundary. The application is hosted by the platform itself, authenticates as a first-class workspace service principal, queries Unity Catalog over the SQL Statement API, and can call adjacent platform services (e.g. AI/BI Genie over the workspace REST API) — all on internal connections.
Stub page. The first wiki source presents Apps as the deployment model for a clinical-trial site-selection workbench (sources/2026-05-13-databricks-clinical-operations-intelligence-belongs-on-the-lakehouse). The Site Feasibility Workbench reference implementation pairs a FastAPI backend with a React frontend and deploys into a workspace with Unity Catalog in "approximately 30 minutes of technical deployment time."
Three structural properties
From the 2026-05-13 source:
- Service-principal authentication. "The app authenticates as a first-class workspace service principal": the app's identity is a workspace primitive, not an external OAuth surface or per-user credential. Auth is provisioned and rotated by the workspace identity system, not a separately managed secrets store.
- SQL Statement API as the path to UC tables. "Queries Unity Catalog via the SQL Statement API", not a JDBC driver shipping credentials around to long-lived connection pools. The app's queries hit UC over an internal HTTP path that inherits the app's service-principal identity end-to-end.
- REST API to adjacent platform services. "Calls AI/BI Genie over the workspace REST API", with all calls "on internal connections": Genie is composed into the app workflow as an embedded NL-query layer, not a separate Genie-room product. The internal-connection property means "clinical operations data never crosses a workspace boundary."
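The SQL Statement API property can be made concrete with a minimal sketch of how a backend might build such a call. The endpoint path and the `warehouse_id`, `statement`, `wait_timeout`, and `parameters` fields follow the public SQL Statement Execution API. The helper name `build_statement_request`, the host, the table name, and the way the app obtains its service-principal token are assumptions for illustration; the source does not disclose them.

```python
import json
import urllib.request


def build_statement_request(host, token, warehouse_id, statement, parameters=None):
    """Build (but do not send) a POST to the SQL Statement Execution API.

    `token` is assumed to be an OAuth token already issued to the app's
    service principal; how the app obtains it is out of scope here.
    """
    body = {
        "warehouse_id": warehouse_id,
        "statement": statement,
        "wait_timeout": "30s",  # wait synchronously for up to 30 seconds
    }
    if parameters:
        # Named parameters are referenced as :name inside the statement text.
        body["parameters"] = [
            {"name": name, "value": str(value)} for name, value in parameters.items()
        ]
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/sql/statements",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical usage: catalog, schema, table, and column names are invented.
request = build_statement_request(
    host="example.cloud.databricks.com",
    token="<service-principal-token>",
    warehouse_id="<warehouse-id>",
    statement="SELECT * FROM clinical.sites.features WHERE site_id = :site_id",
    parameters={"site_id": 42},
)
```

Passing the request to `urllib.request.urlopen` would execute the statement. Because the bearer token belongs to the service principal, Unity Catalog evaluates the query under the app's identity rather than under a credential stored in a connection pool.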
What this composition unlocks
The 2026-05-13 source frames Databricks Apps as one leg of a single-platform application architecture that eliminates three traditional integration layers:
- No analytical↔operational sync pipeline. UC is queried directly by the app via the SQL Statement API; operational state lives in systems/lakebase in the same workspace. There is no ETL between "the analytical warehouse" and "the application database."
- No separately managed credential surface. "Lakebase is the operational database layer — managed PostgreSQL that scales to zero when idle, provisioned and credentialed entirely within the workspace identity system." The app's service principal grants it access to both UC and Lakebase without a separate secrets store.
- No external API surface. "The app inherits Unity Catalog access controls without any additional configuration." UC's row-filter / column-mask / ABAC machinery composes onto the app for free — per-user PHI handling rides on the catalog's HIPAA Safe Harbor / Expert Determination posture configured at the catalog or schema level.
The post's unifying claim: "a clinical operations application that makes no external API calls, maintains no separate operational database infrastructure, and requires no synchronization pipeline between the analytical and operational layers."
Reference implementation
The Site Feasibility Workbench is the first open-source Databricks App documented on this wiki. Stack:
- Backend: FastAPI (Python).
- Frontend: React.
- Data path: SQL Statement API → Unity Catalog (governed Delta tables for site features, predictions, SHAP attributions, audit log).
- NL query path: Workspace REST API → AI/BI Genie.
- App-state path: Lakebase (Postgres) for saved shortlists + team-sharing state.
- Deployment time: "approximately 30 minutes of technical deployment time, before sponsor-specific security review and validation."
The post names three additional Databricks Apps as the rest of the Clinical Operations Intelligence Hub roadmap (Patient Cohort and Recruitment, Enrollment Velocity Optimizer, Risk-Based Monitoring and Compliance) — "All four deploy as Databricks Apps. All four query Unity Catalog directly. None make external API calls."
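The NL query path in the stack above could look roughly like the following sketch. The endpoint paths follow the public Genie Conversation API as documented at the time of writing and should be verified before use. The helper names, the host, and the space ID are hypothetical, and the source does not say which Genie endpoints the Workbench actually calls.

```python
import json
import urllib.request

# Assumption: path prefix of the public Genie Conversation API.
API_ROOT = "/api/2.0/genie/spaces"


def start_conversation_request(host, token, space_id, question):
    """Build (but do not send) the request that opens a Genie conversation."""
    return urllib.request.Request(
        url=f"https://{host}{API_ROOT}/{space_id}/start-conversation",
        data=json.dumps({"content": question}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def message_poll_url(host, space_id, conversation_id, message_id):
    """URL the backend would poll until Genie has finished answering."""
    return (
        f"https://{host}{API_ROOT}/{space_id}"
        f"/conversations/{conversation_id}/messages/{message_id}"
    )


# Hypothetical usage with placeholder identifiers.
request = start_conversation_request(
    host="example.cloud.databricks.com",
    token="<service-principal-token>",
    space_id="<genie-space-id>",
    question="Which sites enrolled fastest in the last study?",
)
```

The request shape mirrors the SQL Statement API call: one workspace host and one service-principal token, so the Genie leg adds no new credential.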
Why this matters for system design
Databricks Apps is the wiki's first canonical example of a workspace-resident application runtime in a data-platform context — the application tier composed into the same identity / governance / audit substrate as the analytical and operational tiers. The architectural counterpoint is the conventional decoupled-tier shape (separate Kubernetes cluster running the web app, separate IAM, JDBC connection pool to the warehouse, sync pipeline to a separate RDS operational DB, separate secrets manager rotating credentials between all four). Each layer in the conventional shape is one more place that "introduces integration overhead, credential surface area, and a synchronization lag that erodes trust in the data the application shows."
For regulated decision-support apps (clinical trials under FDORA 2022 + 21 CFR Part 11 + ICH E6(R3) + FDA GMLP), the workspace-resident shape unlocks a clean ML-audit substrate: predictions and their SHAP attributions are written to governed Delta tables that the app reads from the same workspace, so "the rationale behind a site selection is as auditable as the score itself" via SQL queries against UC. See patterns/shap-attribution-as-governed-delta-table.
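One way to picture the audit substrate is a long-format table with one row per prediction and feature. The sketch below flattens a single prediction's SHAP values into such rows. The column names, the `attribution_rows` helper, and the example feature names are hypothetical; the source states only that predictions and SHAP attributions land in governed Delta tables.

```python
from datetime import datetime, timezone


def attribution_rows(prediction_id, site_id, score, shap_values, scored_at=None):
    """Flatten one prediction's SHAP values into one audit row per feature."""
    scored_at = scored_at or datetime.now(timezone.utc)
    # Rank features by absolute contribution so the largest drivers come first.
    ranked = sorted(shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [
        {
            "prediction_id": prediction_id,
            "site_id": site_id,
            "score": score,
            "feature": feature,
            "shap_value": value,
            "contribution_rank": rank,
            "scored_at": scored_at.isoformat(),
        }
        for rank, (feature, value) in enumerate(ranked, start=1)
    ]


# Hypothetical example: feature names and values are invented.
rows = attribution_rows(
    prediction_id="pred-001",
    site_id="site-17",
    score=0.82,
    shap_values={"enrollment_rate": 0.30, "startup_days": -0.45, "pi_experience": 0.10},
)
```

With rows in this shape, the rationale for a given score becomes a filter on `prediction_id` ordered by `contribution_rank`. That is the kind of plain SQL query against UC that the source has in mind when it calls the rationale auditable.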
Open architectural questions
(For future ingest passes, when more Databricks Apps internals are disclosed.)
- What's the runtime substrate? (Kubernetes pods inside the workspace control plane? Serverless functions? Customer-isolated vs shared?)
- What's the cold-start / scaling envelope? (Per-app scale-to-zero? Per-user session state? Concurrency limits?)
- How is the app's network posture enforced? (Egress filtering? VPC attachment? Private-link to UC / Lakebase / Genie internal endpoints?)
- What's the multi-tenancy model? (One app per workspace? Per workspace user? Per service principal?)
- How does Apps compose with Lakebase scale-to-zero? (Does the app warm Lakebase on first request? Connection pooling at the app or at Lakebase Manager?)
- What's the app-deployment unit? (A Databricks Asset Bundle declaration? Something else?)
Seen in
- sources/2026-05-19-databricks-deutsche-borse-zeppelin-to-databricks-notebook-migration: fourth Databricks Apps face, a customer-built migration-tool substrate. Apps as the deployment platform for a Deutsche Börse-built notebook-migration utility (Zeppelin to Databricks Notebook Converter), distinct from the three previously-canonicalised Apps faces (clinical-ops decision-support / Claroty HITL-UI / DBA-automation + AI-agent-DB-access). The migration-utility face is "the App is itself a migration tool, not the app being a destination workload": the converter runs inside the destination platform's workspace and converts artifacts into the platform's own format. Stack disclosed: shadcn UI frontend (production; evolved from a Streamlit prototype, the second wiki disclosure of Streamlit on Apps after Claroty's React-or-Streamlit optionality), Python backend implied. The Databricks Apps development experience is named as the load-bearing factor that let DBG "ship quickly without standing up separate infrastructure". Pattern instance: patterns/structural-deterministic-logical-llm-split (Apps hosts the structural-conversion stage) + patterns/context-encoded-prompt-handoff (Apps emits the context-encoded prompt that hands off to Genie). 2,000-user migration scope, hours-to-minutes per notebook, business-user-self-service workflow. Forcing function: Cloudera Zeppelin EOL 2027.
- sources/2026-05-15-databricks-backstage-with-lakebase-part-2: third Databricks Apps face, a DBA-automation + AI-agent-DB-access deployment substrate. Two open-source Thoughtworks tools are both deployed as Databricks Apps and both governed by the same Unity Catalog grants and audit trail as the underlying Lakebase databases they operate on. LakebaseOps is a three-agent platform (Provisioning / Performance / Health) replacing 51 historical DBA tickets, plus seven scheduled Databricks Jobs replacing pg_cron, a monitoring UI surfacing live pg_stat metrics + slow-query regressions + branch TTL enforcement, a 9-KPI adoption dashboard, and a migration wizard scoring ten source engines (Aurora / RDS / Cloud SQL / AlloyDB / Cosmos DB / others) with live AWS + Azure API pricing. Lakebase MCP is a Model Context Protocol server exposing 46 tools to MCP-capable AI agents (Claude / Copilot / GPT) with dual-layer governance: a SQL-statement guard + per-tool access guard across four pre-built profiles (read_only / analyst / developer / admin) that map onto the same UC GRANT model, plus per-statement tool-tag attribution making "which agent on which branch generated the 4 AM CPU spike?" a one-SQL query. The structural pairing: "LakebaseOps runs for the team. Lakebase MCP runs with the team. Both inherit the governance posture you just saw." Apps is the deployment substrate that lets governance inheritance hold: the apps don't carry their own credential surface; they authenticate as workspace service principals.
- sources/2026-05-13-databricks-the-rosetta-stone-of-cps-clarotys-ai-powered-library: human-in-the-loop UI face for Entity Resolution. Second canonical Databricks Apps face on the wiki: not just decision-support-app (the clinical-ops face) but the HITL review interface for ER over Lakebase as the operational state. "With the Databricks App and Lakebase, we enable a transparent view and a seamless 'human-in-the-loop' feedback cycle. This intuitive interface allows domain experts to review classifications, correct and enrich entities, and feed high-fidelity, validated data back into our MLflow pipelines and R&D migration." Modern UI frameworks (React or Streamlit) for the frontend; the Streamlit option is new on the wiki, expanding the disclosed frontend stack beyond the FastAPI+React shape from the clinical-ops source. Composes with patterns/orchestrated-multi-agent-entity-resolution (the HITL leg) and systems/lakebase (transactional store for the SME corrections). Canonical wiki instance: systems/claroty-cps-library (17M+ asset CPS catalog).
- sources/2026-05-13-databricks-clinical-operations-intelligence-belongs-on-the-lakehouse: first wiki disclosure of Databricks Apps as a deployment model. Apps is named as one of three platform primitives (alongside systems/lakebase and AI/BI Genie) that compose into a single-platform application architecture. "Databricks Apps run the web application inside the workspace. The app authenticates as a first-class workspace service principal, queries Unity Catalog via the SQL Statement API, and calls AI/BI Genie over the workspace REST API", all "on internal connections." Reference implementation: the open-source Site Feasibility Workbench (FastAPI + React, ~30 min deployment time). Canonical instance of patterns/in-workspace-app-as-decision-support and one leg of concepts/single-platform-application-architecture.
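The dual-layer governance described in the Lakebase MCP entry above (a per-tool access guard plus a SQL-statement guard, both keyed by profile) can be illustrated with a small sketch. Only the four profile names come from the source. The tool names, the statement allow-lists, and the `authorize` helper are hypothetical, and the first-keyword check is a deliberately naive stand-in for whatever statement analysis the real tool performs.

```python
import re

# Hypothetical contents: the source names the profiles, not what each allows.
PROFILE_TOOLS = {
    "read_only": {"run_query"},
    "analyst": {"run_query", "explain_query"},
    "developer": {"run_query", "explain_query", "create_branch"},
    "admin": {"run_query", "explain_query", "create_branch", "drop_branch"},
}
PROFILE_STATEMENTS = {
    "read_only": {"SELECT"},
    "analyst": {"SELECT", "EXPLAIN"},
    "developer": {"SELECT", "EXPLAIN", "INSERT", "UPDATE", "DELETE"},
    "admin": {"SELECT", "EXPLAIN", "INSERT", "UPDATE", "DELETE",
              "CREATE", "ALTER", "DROP"},
}


def statement_kind(sql):
    """Return the leading SQL keyword, upper-cased (naive: ignores CTEs etc.)."""
    match = re.match(r"\s*([A-Za-z]+)", sql)
    return match.group(1).upper() if match else ""


def authorize(profile, tool, sql=None):
    """Apply the tool guard first, then the statement guard if SQL is given."""
    if tool not in PROFILE_TOOLS.get(profile, set()):
        return False, f"tool {tool!r} not allowed for profile {profile!r}"
    if sql is not None:
        kind = statement_kind(sql)
        if kind not in PROFILE_STATEMENTS.get(profile, set()):
            return False, f"statement {kind!r} not allowed for profile {profile!r}"
    return True, "ok"


allowed, reason = authorize("read_only", "run_query", "DROP TABLE shortlists")
```

In this sketch a request must pass both layers: an agent on the `read_only` profile may call `run_query`, yet the call above is still rejected because the statement is a `DROP`. An unknown profile fails the first layer because it has no tool allow-list.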
Related
- systems/lakebase — operational-DB layer for app state, scale-to-zero, credentialed by the workspace identity system.
- systems/unity-catalog — governance + access-control substrate the app inherits via service-principal identity; queried via SQL Statement API.
- systems/databricks-genie — embedded NL-query layer composed via workspace REST API; not a separate product surface.
- systems/site-feasibility-workbench — reference open-source Databricks App (clinical-trial site selection).
- concepts/single-platform-application-architecture — the architectural thesis Apps is one leg of.
- patterns/in-workspace-app-as-decision-support — the deployment pattern Apps canonicalises.
- patterns/shap-attribution-as-governed-delta-table — the audit pattern enabled by the workspace-resident composition.
- systems/deutsche-borse-zeppelin-converter — customer-built migration tool deployed on Databricks Apps; canonical instance of Apps as a substrate for customer-authored migration tooling.
- patterns/structural-deterministic-logical-llm-split — the pattern realised by an Apps-hosted converter handing off to Genie.
- patterns/context-encoded-prompt-handoff — the prompt-handoff mechanism between an Apps-hosted tool and a workspace LLM agent.