DATABRICKS 2026-05-23 Tier 3

Databricks — Scaling for MHHS: how Octopus Energy achieved a 50x cost reduction in margin data engineering¶

Summary¶

The UK's Market-wide Half-Hourly Settlement (MHHS) regulation forces every supplier to move from two meter reads per month → 48 reads per day — a 48× increase in data points per customer driving every margin and settlement calculation. For Octopus Energy (8M+ customers), the projected legacy-architecture cost per settlement date was $23.63 — a 33× increase over historical norms, projected at +$1M / year in unsustainable compute. The team — three engineers in three months — re-architected the margin data pipeline around the article's central insight: grain misalignment. The legacy pipeline ran everything at the finest grain because billing was monolithically monthly; MHHS introduced a fundamental split (industry settlement = half-hourly, smart-tariff revenue = half-hourly, standard-tariff revenue = monthly), and processing all three through one pipeline meant "processing the entire dataset on every run, regardless of what had actually changed". The rebuild splits into three independently-tunable streams (Settlement / Half-Hourly / Monthly) orchestrated by a "Job of Jobs" pattern, sitting on a unified multi-grain "downstream consumption layer" that became the highest-leverage optimisation site. The single largest win came from Delta Change Data Feed-based incremental processing of the upstream consumption tables: rows processed per run dropped from 25 billion → 300 million (98.8% reduction) and freshness improved from weekly → daily. Spark/Delta optimisations layered on top: broadcast joins for reference tables under 500 MB, liquid clustering on filter/join columns, column and row pruning at the earliest stage, lineage simplification by consolidating data early, and a counter-intuitive but load-bearing move — removing custom optimisation code and trusting Spark's Adaptive Query Execution. Databricks Serverless is named as the development-velocity enabler that made the three-month delivery window viable: zero cluster startup time + side-by-side run comparisons in the Serverless UI for isolating the effect of individual optimisations. Result: $0.48 per settlement date — not just 50× below the projected MHHS cost, but 2× cheaper than the legacy system despite processing 48× more data. Annualised cost avoidance ~$1M (figure excludes the upstream-incremental savings, which are additional). The article generalises four transferable takeaways: grain misalignment is the hidden cost driver, incremental processing transforms pipeline economics, remove before you add, trust the optimiser.

Key takeaways¶

Grain misalignment is the architectural antipattern when business signal splits. "The legacy pipeline had been built around a single grain: monthly. Billing ran monthly. Settlement ran monthly. The entire pipeline was monolithic by design. MHHS introduced a fundamental split. […] Running all three through a single monolithic pipeline meant processing the entire dataset on every run, regardless of what had actually changed." The cost penalty is structural — every run pays the finest-grain price even for the coarsest-grain consumer. Canonicalised as concepts/grain-misalignment.
Three streams, one source of truth — split the pipeline by the natural grain of each consumer. "The team re-architected around three specialised streams, each optimised independently for its natural grain" — Settlement (half-hourly for regulatory cost allocation, "matches that grain exactly"), Half-Hourly (smart-tariff customers — EV drivers, heat pumps, time-of-use products, "where the half-hourly price signal is the entire commercial proposition"), Monthly (standard tariffs, unchanged grain but reconcilable against half-hourly). All three converge on a "unified, multi-grain source of truth consolidating meter reads, smart meter data, and industry flows at multi-terabyte scale" that is "the reconciliation bridge between monthly billing and half-hourly settlement". Canonicalised as patterns/grain-aligned-stream-split + systems/octopus-margin-data-pipeline.
A "Job of Jobs" orchestrates dependencies and parallel execution across the three streams. "A 'Job of Jobs' orchestration pattern manages dependencies and parallel execution across all three streams. Each stream is independently tunable — what works as a Spark optimisation for Settlement is not necessarily right for NHH." Canonicalised as patterns/job-of-jobs-orchestration — the pipeline-of-pipelines orchestration shape that lets each constituent pipeline carry its own tuning profile while a higher-level scheduler enforces order and parallelism.
CDF-based incremental processing was the single highest-leverage optimisation: 25 B → 300 M rows per run (98.8% cut) + weekly → daily freshness. "Delta Lake's Change Data Feed (CDF) made true incremental processing viable at this grain. Instead of complete overwrites, the pipeline now reads only records that have actually changed since the last run. The result: rows processed per run dropped from 25 billion to 300 million — a 98.8% reduction. Data freshness improved from weekly to daily." The savings figure ($1M / yr) excludes this upstream-table incremental win — "The full efficiency gain is larger". Canonicalised as patterns/cdf-incremental-replacing-full-rescan and added as a new face on Delta CDF (the multi-terabyte-upstream-consumption-table face, distinct from the medallion-Bronze-to-Silver face from Claroty's source).
Broadcast joins for reference tables under 500 MB beat shuffle on multi-key joins with date ranges. "Broadcast joins for reference tables under 500MB, eliminating expensive shuffle operations on complex multi-key joins with date ranges." Canonicalised as patterns/broadcast-join-for-small-reference-tables — a Spark-tuning pattern with a concrete size threshold disclosed.
Liquid clustering replaced fixed partitioning on filter/join columns; avoided the small-file problem. "Liquid clustering was enabled across multiple tables for columns frequently used in filters and joins. Liquid clustering dynamically co-locates related records on the specified clustering keys without requiring fixed partition boundaries. Liquid clustering avoids the small-file problem, higher memory consumption, and I/O overhead that come from over-partitioning." Promoted to its own page: systems/liquid-clustering.
Trust the optimiser — AQE outperformed hand-tuned logic, so the team removed the hand-tuning. "In several cases, Spark's Adaptive Query Execution (AQE) outperformed hand-tuned logic. The team removed custom optimisation code and let AQE do its job." Canonicalised as systems/spark-aqe (the optimiser as a system actor) — the principle is that the optimiser, when given enough runtime statistics, beats static heuristics encoded in user code; the action item is to delete code, not write more.
Remove before you add — auditing existing optimisation choices is as impactful as adding new ones. "That last point bears emphasis: removing unjustified compute operations was as impactful as adding new optimisations. If you are running Z-ordering or ANALYZE without measuring their effect, they may be costing you more than they are saving." Canonicalised as concepts/remove-before-add-optimization — a counter-instinct architectural principle: assume each existing optimisation has a cost and demand a measurement before keeping it. Enumerated targets: Z-ordering, ANALYZE, custom shuffle logic.
Lineage simplification + column/row pruning at the earliest stage are the I/O-reduction levers. "Simplified lineage by consolidating data early in the pipeline, reducing downstream joins and shuffle operations. Data pruning: selected only the columns strictly necessary for settlement and pruned rows at the earliest possible stage, reducing I/O overhead before expensive transformations." Two named optimisation categories: lineage and I/O reduction (consolidate-early + column-prune-early + row-prune-early) and join and partition tuning (broadcast joins + liquid clustering).
Databricks Serverless is the development-velocity enabler, not just a runtime. "Databricks Serverless made the three-month delivery window viable. Zero cluster startup time meant the team could iterate rapidly — write, run, measure, adjust — without waiting for infrastructure to provision. The Serverless UI enabled side-by-side run comparisons, making it practical to isolate the effect of individual optimisations." Quote: "The testing and development process could not have been done without serverless. Using the serverless UI helped us to identify bottlenecks and make easy comparisons between different runs." Adds a new face to Serverless Compute — the iteration-loop accelerator face, distinct from the stability-as-system-property face from the 2026-05-06 source.

Final cost geometry: $0.48 per settlement date is 50× below the projected MHHS cost AND 2× below the legacy. Verbatim numbers from the article's table:

Metric	Before	After	Change
Rows processed per run	25 billion	300 million	98.8% reduction
Cost per settlement date (projected MHHS)	$23.63	$0.48	~50× reduction
Cost per settlement date (vs legacy)	$0.71	$0.48	2× more efficient
Savings per month-end run	—	~$83,000	vs unoptimised projection
Annualised cost avoidance	—	~$1,000,000	excludes upstream savings
Data freshness	Weekly	Daily	7× improvement
Build time	—	3 months	Team of three

"The $0.48 per settlement date is not just a 50x reduction from the MHHS projected cost — it is 2x cheaper than the legacy system had ever been, despite processing 48x more data points. Re-architecture delivered regulatory compliance and made the system materially more efficient than the one it replaced." 12. Generalisation: any monthly→daily / daily→real-time / aggregate→transactional transition triggers the same dynamics. "MHHS is a UK energy regulation. However, the pattern it represents — a regulatory or business event that multiplies data volume at a finer grain — is not unique to energy. Any time a system moves from monthly to daily, daily to real-time, or aggregate to transactional, the same dynamics apply." Four transferable takeaways named: grain misalignment is the hidden cost driver; incremental processing transforms pipeline economics; remove before you add; trust the optimiser.

Architecture extracted¶

The three-stream split¶

                   Industry data flows (half-hourly, 48/day/customer)
                                    │
                                    ▼
                  ┌───────────────────────────────┐
                  │  Unified multi-grain source   │
                  │  of truth (multi-terabyte)    │ ← incremental via Delta CDF
                  │  meter reads + smart meter +  │   (25 B → 300 M rows/run)
                  │  industry flows               │
                  └───────────┬───────────────────┘
                              │
            ┌─────────────────┼─────────────────┐
            ▼                 ▼                 ▼
      ┌──────────┐      ┌──────────────┐  ┌──────────┐
      │Settlement│      │ Half-Hourly  │  │ Monthly  │
      │  stream  │      │   stream     │  │  stream  │
      │ (HH grain│      │ (smart       │  │ (standard│
      │  for     │      │  tariffs:    │  │  tariffs)│
      │  cost    │      │  EVs, heat   │  │          │
      │  alloc)  │      │  pumps, ToU) │  │          │
      └──────────┘      └──────────────┘  └──────────┘
            └──────── Job of Jobs orchestration ────┘
                  (dependency mgmt + parallel exec)

Each stream is independently tunable — "what works as a Spark optimisation for Settlement is not necessarily right for NHH". The shared upstream consumption layer is the reconciliation bridge between monthly billing and half-hourly settlement.

Optimisation taxonomy (four categories disclosed)¶

Category	Levers applied	Lever NOT applied
Lineage and I/O reduction	Consolidate data early; column pruning; row pruning at the earliest stage	—
Join and partition tuning	Broadcast joins for ref tables <500 MB; liquid clustering on filter/join columns	Fixed partition boundaries (rejected as cause of small-file problem)
Trusted the optimiser	Removed custom optimisation code; let AQE run	Hand-coded shuffle logic (deleted)
Remove before you add	Audited Z-ordering and ANALYZE for measured benefit	Z-ordering / ANALYZE without measurement

The fourth category is the one most engineers don't have in their toolkit: un-optimisations. The article frames it as a co-equal lever to adding new ones.

CDF-based upstream incremental processing¶

naive shape (rejected):
upstream-consumption-table  ──FULL OVERWRITE──► downstream pipelines
  multi-terabyte             every run             25 B rows / run

CDF shape (adopted):
upstream-consumption-table ─CDF: changed rows─► downstream pipelines
  multi-terabyte             since last run        300 M rows / run
                             (98.8% reduction)

The article explicitly notes the headline $1M annualised figure excludes the upstream-incremental savings — the full efficiency gain is materially larger.

Cost & freshness numbers (verbatim)¶

Rows / run: 25 B → 300 M (98.8% reduction)
Cost / settlement date (projected MHHS): $23.63 → $0.48 (~50× reduction)
Cost / settlement date (vs legacy): $0.71 → $0.48 (2× cheaper than legacy)
Savings / month-end run: ~$83,000
Annualised cost avoidance: ~$1M (excludes upstream savings)
Data freshness: weekly → daily (7×)
Build time: 3 months, team of three
Volume change driver: 2 reads/customer/month → 48 reads/customer/day (48×)

Operational caveats¶

The $1M / yr figure is a floor, not the full saving. The article is explicit that the figure "exclude[s] the additional savings from this move to incremental processing on upstream tables. The full efficiency gain is larger." Treat the headline as conservative.
Legacy was already $0.71 — the comparison is not "legacy vs after"; it's "legacy vs MHHS-projected vs after". The architectural argument is that the rebuild made the system cheaper than the legacy that handled 1/48 the volume — a regulatory mandate became a net efficiency win.
Liquid clustering's "no fixed partition boundaries" is the explicit reason the team didn't reach for traditional partitioning. The small-file / over-partitioning trap is named as the failure mode being avoided.
AQE is not a panacea — the article says "in several cases" AQE outperformed hand-tuning, not always. The principle is measure first, then choose — but the default should be the optimiser, with hand-tuning earning its place.
The "remove before add" principle has an implementation requirement: measurement. Without measurement there is no way to know which optimisation is or isn't paying for itself. Z-ordering and ANALYZE are named because they're the most commonly-applied Delta optimisations that are also rarely measured.
The article does not disclose the orchestration runtime for "Job of Jobs" — likely Lakeflow Jobs / Databricks Workflows but not named explicitly.
The half-hourly settlement architecture is UK-energy-specific in name, but the article generalises — the pattern fires whenever a regulatory or business event multiplies data volume at a finer grain than the existing pipeline assumes.
Three engineers, three months for the rebuild is the disclosed team size — a disclosed constraint that argues for serverless development velocity over self-managed infra.

Source¶

Companies: companies/databricks · companies/octopus-energy
Systems: systems/octopus-margin-data-pipeline · systems/delta-lake · systems/apache-spark · systems/spark-aqe · systems/liquid-clustering · systems/databricks-serverless-compute
Concepts: concepts/grain-misalignment · concepts/data-pipeline-grain · concepts/remove-before-add-optimization · concepts/delta-change-data-feed
Patterns: patterns/grain-aligned-stream-split · patterns/cdf-incremental-replacing-full-rescan · patterns/broadcast-join-for-small-reference-tables · patterns/job-of-jobs-orchestration