DATABRICKS 2026-05-19 Tier 3

Databricks — How Deutsche Börse built a generative AI tool to tackle the large-scale migration of Zeppelin notebooks to Databricks

A 2026-05-19 Databricks customer-blog post co-authored with Deutsche Börse Group (Frankfurt-headquartered financial-market-infrastructure operator whose StatistiX platform carries ~95% of the group's Clearing and Trading data) describing the Zeppelin to Databricks Notebook Converter, a Databricks App that converts Apache Zeppelin notebooks (running on the to-be-decommissioned Cloudera platform) into Databricks-native .ipynb notebooks and emits a context-augmented prompt for downstream LLM-assisted logic reconstruction via Genie.

The post is operationally a launch + lessons-learned narrative, but its architectural payload — separate the deterministic part of a migration from the non-deterministic part, and apply the right tool to each — is reusable across any heterogeneous-code-migration shape (notebook ↔ notebook, query language ↔ query language, framework ↔ framework). The post makes that thesis explicit and pins it with a concrete deployment.

Summary

Deutsche Börse's StatistiX platform serves Clearing & Trading data to hundreds of business users, predominantly via Zeppelin notebooks running on Cloudera with HDFS + Oracle backends. Cloudera is fully decommissioning Zeppelin in 2027 and the group has selected Databricks as the unified analytics replacement. The migration target is 2,000+ users and a high volume of notebooks — many embedding institutional knowledge and business-specific custom logic accreted over years.

Manual rewriting was infeasible (years of effort) and rule-based rewriting was infeasible (the body of notebooks is too heterogeneous to reduce to deterministic transformations). The team's design insight: the migration is not one problem but two, and they have very different solutions.

  1. Structural conversion — Zeppelin paragraphs → Databricks cells, interpreter prefix translation (%python, %sql, %pyspark), notebook metadata reformatted into valid .ipynb JSON. Deterministic. Automatable. Original content preserved exactly.
  2. Logical reconstruction — SQL/Python logic, custom Zeppelin interpreters, HDFS/Oracle references, visualisations, widgets, scheduling logic, business-specific custom code. Heterogeneous. Per-notebook. Best handled by an LLM that can interpret context and ask clarifying questions.

The converter (a Databricks App with a shadcn UI frontend, evolved from a Streamlit prototype) automates only step 1. For step 2 it generates a context-aware prompt that encodes Deutsche Börse's specific Zeppelin environment — custom interpreters, data sources, configuration patterns — and hands it off to Genie inside the user's Databricks workspace. The user pastes the prompt; Genie asks clarifying questions and rebuilds the notebook.

Net result: per-notebook redevelopment effort drops from hours of manual work to 15–20 minutes, and the migration becomes business-user self-service — no dedicated engineering team required to migrate notebooks one at a time.

Key takeaways

  1. Separate structure from logic, apply the right tool to each. "Structural conversion (mapping Zeppelin's paragraph format to Databricks cells, translating interpreter syntax, reformatting metadata) is deterministic and automatable, while logic reconstruction is not." The post assigns the logic-reconstruction part, not the structural part, to the LLM. The architectural insight is that what looks like a single migration problem is two problems with very different cost curves and very different correctness profiles. Deterministic transforms (paragraph → cell, interpreter prefix mapping, JSON reformat) reduce cleanly to rules; non-deterministic transforms (logic, references, custom interpreters), if forced through rules, "would introduce errors and undermine trust in the output". (Source: this post.)

  2. Heterogeneous code defeats rule-based migration engines structurally, not incrementally. "The diversity across the entire notebook landscape made a rule-based rewriting engine impractical, since the logic was simply too heterogeneous and too business-specific for automated rules to handle reliably." This is not a "we didn't write enough rules" problem; it is a "the input space is fundamentally non-uniform" problem. Each notebook "reflected institutional knowledge from the business teams who relied on it", meaning the variability lives in the users' notebooks, not in gaps in the tool's rule coverage. Adding more rules cannot close the gap. (Source: this post.) See concepts/heterogeneous-code-migration.

  3. Generic LLM prompts produce generic results; environment-encoded prompts produce usable ones. "Context is the difference between a good prompt and a great one. Generic Genie prompts produce generic results. Investing in a prompt that encodes knowledge of our specific environment — interpreters, data sources, configuration patterns — is what made the output actually usable." The converter doesn't just hand the user a notebook; it hands the user a prompt template populated with Deutsche Börse's environment. The encoded context is what closes the "this is technically Python, but Genie has never seen Deutsche Börse's custom Zeppelin interpreters" gap. (Source: this post.) See concepts/context-encoded-llm-prompt.

  4. Avoid overengineering: simple UI + clean backend beats agentic architecture for bounded migration tasks. "Our first attempt used a more complex agentic architecture that added overhead without solving the core problem. A simple UI and a clean backend turned out to be exactly sufficient." The team initially reached for an agentic architecture; they found that the migration task is bounded enough (one input file, one output file, one prompt) that an agentic loop adds latency, complexity, and failure modes without improving outcomes. A linear app + LLM hand-off won over an autonomous agent loop, a notable counterexample to the "everything must be agentic" reflex in 2026 LLM tooling. (Source: this post.)

  5. Hours-to-minutes per notebook + business-user self-service is the migration economics that makes 2,000-user platform moves tractable. "By combining structural conversion with AI-assisted logic reconstruction, we've reduced notebook redevelopment from hours of manual effort to 15–20 minutes per notebook, depending on complexity." The speed gain is one half; the access-pattern gain is the other half. "Business users don't need deep Databricks expertise to migrate their own notebooks. They follow a short sequence of steps, get a prompt, and let Genie do the reconstruction." The migration tool becomes a self-service workflow: export Zeppelin JSON → upload → click Convert → download .ipynb → upload to Databricks → paste prompt into Genie → answer clarifying questions. No engineering team in the loop per notebook. (Source: this post.)

  6. The handoff between automation and AI is the design surface, not the automation or the AI. "This hybrid approach of automating the deterministic part and delegating the variable part allows us to avoid the brittleness of rule-based systems and leverage AI where it actually performs well." This sentence is the one most reusable across other heterogeneous-migration shapes: it frames the handoff as the load-bearing design decision. The structural converter explicitly preserves all logic untouched (SQL, Python, visualisations, widgets, HDFS/Oracle references) so that the LLM works from the user's exact code rather than from a partially rewritten approximation of it. The deterministic stage's job is to set up the LLM stage; the LLM stage's job is to interpret context the rules can't. (Source: this post.) See patterns/structural-deterministic-logical-llm-split.

  7. Engage the platform team early to avoid rework. "Our collaboration with the Databricks team throughout the build helped us stay aligned and avoid rework." A standard joint-engineering observation, but worth pinning: large enterprise migrations onto vendor platforms benefit from continuous platform-team collaboration over arms-length adoption, because the constraints discovered during build (e.g. Databricks Apps deployment model, Genie prompt-context window, .ipynb schema variants) shape the architecture in ways the customer cannot anticipate from documentation. (Source: this post.)

  8. Customer-built migration tooling is now durable enough to ship as a Databricks App. The team initially built a Streamlit prototype, then upgraded to a shadcn UI frontend on the Databricks Apps platform for "a more professional and scalable interface", adding that "the Databricks Apps development experience made it straightforward to ship quickly without standing up separate infrastructure." This appears to be one of the first publicly disclosed customer Databricks App deployments where the App is itself a migration utility (rather than an analytics or decision-support workload). The pattern may generalise: customer-authored migration tooling can run inside the destination platform's app substrate, although the post does not say whether this tool will be shared beyond Deutsche Börse. (Source: this post.)

Architecture extracted

The Zeppelin → Databricks Notebook Converter

A Databricks App (frontend: shadcn UI; backend: Python; deployment: Databricks Apps platform inside customer workspace) that performs structure-only conversion of a single uploaded Zeppelin notebook export to a Databricks-compatible .ipynb file, plus emits a context-augmented prompt for downstream LLM-driven logic reconstruction.

Conversion-stage transforms (deterministic, rule-based):

  • Paragraph → cell. Each Zeppelin paragraph (the unit of execution in Zeppelin) becomes a Databricks cell (the unit of execution in Databricks). Same content, new container format.
  • Interpreter prefix mapping. Zeppelin's per-paragraph interpreter directives (%python, %sql, %pyspark and others) are translated to their Databricks-native equivalents. The mapping is finite and known.
  • Metadata → valid .ipynb JSON. Zeppelin's notebook-level metadata (kernel info, layout, configuration) is reformatted into the Jupyter notebook JSON schema that Databricks consumes.
  • Original content preserved exactly. SQL strings, Python code, visualisation specs, widget definitions, scheduling fragments, HDFS/Oracle references — none of these are rewritten. They are copied verbatim into the target notebook.
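
The conversion-stage transforms above can be sketched in a few lines of Python. This is a minimal illustration, not the team's implementation (the post does not publish code). It assumes a Zeppelin export is a JSON object with a "paragraphs" list whose entries carry a "text" field, that the interpreter directive sits alone on the first line of a paragraph, and that the prefix pairs in PREFIX_MAP (which are hypothetical; the post does not list the real mapping) are the ones wanted. Unknown directives, such as custom interpreters, are deliberately passed through unchanged.

```python
import json

# Hypothetical prefix mapping. The post says the mapping is finite and known
# but does not list it, so these pairs are illustrative assumptions.
PREFIX_MAP = {
    "%pyspark": "%python",
    "%python": "%python",
    "%sql": "%sql",
    "%md": "%md",
}


def translate_paragraph_text(text):
    """Swap a known first-line directive; leave everything else verbatim."""
    first, sep, rest = text.partition("\n")
    mapped = PREFIX_MAP.get(first.strip())
    if mapped is None:
        # No directive, or an unknown/custom one: copy the paragraph as-is
        # so the LLM stage sees exactly what the user wrote.
        return text
    return mapped + sep + rest


def paragraph_to_cell(paragraph):
    """Turn one Zeppelin paragraph into one nbformat-v4 code cell."""
    return {
        "cell_type": "code",
        "metadata": {},
        "execution_count": None,
        "outputs": [],
        "source": translate_paragraph_text(paragraph.get("text") or ""),
    }


def convert(zeppelin_json):
    """Convert a Zeppelin note export (JSON string) to .ipynb JSON text."""
    note = json.loads(zeppelin_json)
    notebook = {
        "nbformat": 4,
        "nbformat_minor": 4,
        # Storing the note name under "title" is an assumption of this sketch.
        "metadata": {"title": note.get("name", "")},
        "cells": [paragraph_to_cell(p) for p in note.get("paragraphs", [])],
    }
    return json.dumps(notebook, indent=1)
```

Only the first line of a paragraph is ever touched, which is what keeps the "original content preserved exactly" property easy to reason about.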

Prompt-generation stage (per notebook, automatic):

  • Context block populated from environment. The prompt template embeds Deutsche Börse-specific facts: their custom Zeppelin interpreters, their HDFS+Oracle data-source patterns, their configuration conventions. (Without this, Genie does not know that "%spd" means "the StatistiX Spark interpreter with Oracle credentials pre-bound" or that a particular HDFS path corresponds to a particular logical table.)
  • Output: a single prompt the user pastes into Genie. The user does not configure or assemble the prompt — the converter generates it on every run.
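
One way to picture the prompt-generation stage is a template filled from a small environment record. The sketch below is an assumption-laden illustration: the class EnvironmentContext, the function build_prompt, the prompt wording, and every example value are hypothetical, since the post describes what the prompt encodes (interpreters, data sources, configuration patterns) but not how it is assembled.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentContext:
    """Site-specific facts to embed in the prompt (all values hypothetical)."""
    interpreters: dict = field(default_factory=dict)  # directive -> meaning
    data_sources: dict = field(default_factory=dict)  # legacy path -> table
    conventions: list = field(default_factory=list)   # free-text rules


def build_prompt(notebook_name, directives_used, env):
    """Render one paste-ready prompt for a converted notebook."""
    lines = [
        f"Migrate the notebook '{notebook_name}' from Apache Zeppelin to Databricks.",
        "Cell contents were copied verbatim from Zeppelin and are not yet adapted.",
        "Ask clarifying questions before rewriting anything ambiguous.",
        "",
        "Interpreters used in this notebook:",
    ]
    for directive in sorted(directives_used):
        # Directives missing from the context are flagged, not guessed.
        meaning = env.interpreters.get(directive, "unknown, ask the user")
        lines.append(f"- {directive}: {meaning}")
    lines += ["", "Known data-source mappings:"]
    for legacy, target in sorted(env.data_sources.items()):
        lines.append(f"- {legacy} -> {target}")
    lines += ["", "Conventions:"]
    lines += [f"- {rule}" for rule in env.conventions]
    return "\n".join(lines)
```

Keeping the environment facts in data rather than in the template text means the same generator can be re-run as the prompt definitions are refined.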

The user workflow:

  1. Export the Zeppelin notebook as JSON (out-of-band, in Cloudera).
  2. Upload the JSON into the converter app.
  3. Click Convert.
  4. Download the converted .ipynb.
  5. Open Databricks, upload the notebook, launch Genie, paste the generated prompt.
  6. Genie asks clarifying questions and rebuilds the logic in a Databricks-native form.

The hand-off between (4) and (5) is the architectural seam: the deterministic side ends with the downloaded .ipynb; the LLM side starts when the user pastes the prompt into Genie.

What was intentionally left out of the converter

The post is unusually explicit about this: "One of the most important design decisions was determining what the tool should intentionally leave alone."

Items the converter does not rewrite:

  • SQL logic
  • Python logic
  • Visualisations
  • Widgets
  • Oracle / HDFS references
  • Scheduling logic
  • Business-specific custom code (custom Zeppelin interpreters, internal helper imports)

The rationale is uniform across all of them: "All of that content is preserved in the converted notebook, untouched, because rewriting it automatically would introduce errors and undermine trust in the output. These are exactly the elements that vary most across notebooks and that carry the most business-critical logic. They belong to Genie, which can interpret context, ask clarifying questions and make judgment calls that rules cannot."
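
The "untouched" guarantee is simple enough to check mechanically. The sketch below is a hypothetical verification helper (the post does not describe one): it compares each original paragraph with its converted cell after setting aside a leading %directive line, and reports any index where the body differs. It assumes the directive, if present, occupies the first line.

```python
def body_of(text):
    """Return the text after a leading %directive line, else the text itself."""
    first, _, rest = text.partition("\n")
    return rest if first.strip().startswith("%") else text


def changed_cells(paragraph_texts, cell_sources):
    """Return the indexes where the converted body differs from the original."""
    if len(paragraph_texts) != len(cell_sources):
        raise ValueError("paragraph and cell counts differ")
    return [
        index
        for index, (original, converted) in enumerate(
            zip(paragraph_texts, cell_sources)
        )
        if body_of(original) != body_of(converted)
    ]
```

An empty result means only directives changed; any non-empty result points at a cell where the structural stage altered user logic it was supposed to leave alone.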

This is the cleanest articulation in the wiki of the negative space of rule-based migration — the deliberate decision not to rewrite — as a first-class design choice.

Rejected design: agentic architecture

"Our first attempt used a more complex agentic architecture that added overhead without solving the core problem." The team explicitly rejects an agent-loop deployment shape after building it, in favour of a simple UI + clean backend. The post reads as a deliberate counter-cyclical signal against the 2026 industry default of "reach for an agent first".

Frontend evolution: Streamlit → shadcn UI

Streamlit was the prototype; the production app uses shadcn UI for "a more professional and scalable interface." The team credits the Databricks Apps development experience with making this swap feasible "without standing up separate infrastructure". shadcn UI is not architecturally load-bearing for the core thesis, but it is a marker of the production-quality UX expected of a tool 2,000+ business users will run.

Operational numbers

  • 2,000+ users to migrate from Zeppelin (on Cloudera) to Databricks.
  • 2027 Cloudera Zeppelin decommissioning deadline (forcing function).
  • Hours of manual effort → 15–20 minutes per notebook for redevelopment (depends on complexity).
  • ~95% of all Clearing and Trading data at Deutsche Börse Group flows through StatistiX (the platform being migrated).
  • Hundreds of business users are direct StatistiX consumers.

The disclosed numbers are limited: the post does not state total notebook count, success rate of generated prompts on first pass, fraction of notebooks needing human reconciliation after Genie reconstruction, or compute cost per converted notebook. These gaps are taken up under Caveats and reservations.

Caveats and reservations

  • Tier-3 customer-blog launch post. Architecture density is moderate (~30%) — passes the "Borderline cases — include, don't skip: Product launches THAT ALSO contain deep architecture sections" test in AGENTS.md because the structural-vs-logical-split thesis is concretely instantiated, but readers should treat the "hours to minutes" claim as the team's own measurement, not a benchmarked one.
  • Single deployment, in pilot. The post explicitly states "the initial development of our converter tool is complete, we are now proceeding with large-scale, real-world testing" — meaning the tool has not yet handled the 2,000-user migration at scale. Open question: how often does Genie's clarifying-question loop converge on a correct rebuild without human SQL/Python edits?
  • Genie quality is the load-bearing assumption. The structural-vs-logical split is only economically viable if the LLM stage handles the heterogeneous logic well. The post does not quantify Genie's accuracy on Zeppelin → Databricks logic translation; it only quantifies the time savings. Failure mode not disclosed: how often Genie generates plausibly-correct-but-subtly-wrong reconstructions of business-critical financial-data logic.
  • Custom-interpreter generalisability uncertain. The context-encoded prompt is hand-tuned for Deutsche Börse's specific Zeppelin interpreters. The post's "finalising prompt definitions to improve accuracy" line in the What's next section suggests prompt iteration is ongoing.
  • Tool not (yet) generally available. The converter appears to be customer-built tooling for Deutsche Börse's specific migration. Whether Databricks productises it as a platform feature is not stated. (Other Databricks customers facing the same Cloudera Zeppelin 2027 cliff may need to rebuild the same tool.)
  • No discussion of HDFS/Oracle data-source migration. The notebook content is migrated; the underlying data systems (HDFS data, Oracle data) are referenced in the notebooks but their migration story is out of scope for this post. A complete StatistiX move presumably also requires a data-tier migration that this writeup does not cover.
  • No tooling code disclosed. Unlike the AWS / Synthesia G7e post (with aws-samples/sample-asynchronous-video-decoding), this post does not link a public reference implementation. The converter design is described but not open-sourced.
