Skip to content

PATTERN Cited by 1 source

Chargeback / Cost Attribution

Pattern: the infrastructure tier that does the expensive thing (egress, compute, storage) also records the cost-driving metric (bytes-transferred, CPU-seconds, row-scans) with enough granularity to attribute the cost back to a specific owning entity (team, data product, tenant, customer). A reporting job converts that raw metric into per-entity currency and pushes a bill into an accountability channel.

Without this, shared infrastructure becomes a commons-tragedy cost centre: whoever is closest to the bill complains, but the party whose design decisions drove the bill has no feedback loop.

Mercedes-Benz's realisation

  • Metric at the tier. Delta Deep Clone Sync Jobs record the exact bytes transferred per run, per data product. The jobs are the only place the truth exists — egress happens there, not at the consumer query layer.
  • Reporting Job. A separate daily job aggregates per-job bytes into per-Data-Product egress cost (and compute cost for the sync tier).
  • Chargeback direction: producer, not consumer. The upstream data producer is billed for replication egress, not the Azure consumer team. This is the meaningful design choice — it puts the cost where the design lever is. A consumer can't usefully reduce egress; the producer can, by modelling the data better, sharding it more cleanly, or gating which columns participate in sharing.
  • Dashboard as the forcing function. Cost dashboard per data product makes the bill visible to the team whose design choices drive it — which is the whole point.

(Source: sources/2026-04-20-databricks-mercedes-benz-cross-cloud-data-mesh)

Key design decisions

  1. Attribute at the source of the cost. Don't try to back-derive per-team egress from a flat cloud bill — capture it where the tier is actually moving bytes. The Sync-Job-in-the-loop shape makes this almost free, because the job already knows what it copied.
  2. Pick a chargeback direction with the right incentive. The party being billed must have architectural levers to reduce the bill. Billing consumers for something they can't shrink (cross- cloud topology, dataset size) is noise-not-signal.
  3. Push it into an operational channel, not an annual audit. Daily rollup per data product → owning team's dashboard; not a once-a-quarter spreadsheet. Feedback loop latency matters.
  4. Include compute cost if it's material. Mercedes-Benz's dashboard tracks compute cost of the Sync Jobs alongside egress, so producers see the full cost of their replication strategy, not just bytes.

Why it matters

The pattern is the governance primitive that keeps patterns/cross-cloud-replica-cache economically healthy over time. Without chargeback, producers have no incentive to keep the shared data set lean — they externalise the cost onto whoever is paying the egress bill. With chargeback, producers notice when a data product costs 10× its peers and fix it.

Seen in

Last updated · 200 distilled / 1,178 read