Netflix FPD + CEA (Cloud Efficiency data platform)
FPD (Foundational Platform Data) + CEA (Cloud Efficiency Analytics) is the two-layer internal data platform that the Netflix Platform DSE (Data Science Engineering) team uses to attribute cost and ownership across Netflix's AWS footprint to specific teams, services, and organisations. FPD is the normalised substrate (inventory + ownership + usage); CEA is the business-logic layer that turns that substrate into attributed-cost time series that engineering organisations can consume.
Architecture
Two distinct layers with a deliberate separation of concerns:
- FPD (Foundational Platform Data). Ingests three normalised streams from each Netflix platform (e.g. Spark):
  - Inventory: what resources exist.
  - Ownership: which team / user / org owns them.
  - Usage: how those resources were exercised over time.

  FPD establishes data contracts with each platform's owners to guarantee data quality and reliability, and transforms heterogeneous platform emissions into a consistent data model for ownership. The standardised model is what makes the downstream analytics layer scalable.
- CEA (Cloud Efficiency Analytics). Consumes FPD, applies per-platform business logic ("cost heuristics are unique to each platform"), and produces time-series efficiency metrics at multiple aggregation granularities. CEA is described as "compartmentalized and transparent": downstream consumers can trace why a given dollar shows up under their org and how it was calculated.
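The two-layer split can be illustrated with a minimal Python sketch. Everything here is hypothetical: the record types, field names, the flat `unit_price` table, and the `"unattributed"` bucket are assumptions made for illustration, not the schema or cost heuristics Netflix uses. The sketch only shows the shape of the idea: FPD supplies three normalised streams, and a CEA-style step joins them into attributed cost per owner per day.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InventoryRecord:      # hypothetical FPD inventory row
    resource_id: str
    platform: str


@dataclass(frozen=True)
class OwnershipRecord:      # hypothetical FPD ownership row
    resource_id: str
    owner: str


@dataclass(frozen=True)
class UsageRecord:          # hypothetical FPD usage row
    resource_id: str
    day: date
    units: float            # platform-specific unit, e.g. core-hours


def attribute_cost(inventory, ownership, usage, unit_price):
    """CEA-style step: join the three streams into {(owner, day): cost}.

    unit_price maps platform -> price per usage unit. This is an assumed,
    simplified stand-in for per-platform cost heuristics.
    """
    platform_of = {r.resource_id: r.platform for r in inventory}
    owner_of = {r.resource_id: r.owner for r in ownership}
    costs = defaultdict(float)
    for u in usage:
        # Resources with no ownership row land in an explicit bucket
        # instead of silently disappearing from the totals.
        owner = owner_of.get(u.resource_id, "unattributed")
        costs[(owner, u.day)] += u.units * unit_price[platform_of[u.resource_id]]
    return dict(costs)


day = date(2025, 1, 2)
attributed = attribute_cost(
    inventory=[InventoryRecord("job-1", "spark"), InventoryRecord("job-2", "spark")],
    ownership=[OwnershipRecord("job-1", "team-a")],
    usage=[UsageRecord("job-1", day, 10.0), UsageRecord("job-2", day, 4.0)],
    unit_price={"spark": 0.5},
)
print(attributed)
```

Keeping the join keyed on a shared `resource_id` is what lets the business logic stay independent of how each platform emits its data; a usage row whose resource is missing from inventory raises a `KeyError` here, which is one crude way a contract violation could surface.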
Design principles
From the post:
- Accuracy, reliability, accessibility. These are the team's stated tenets: efficiency data is only useful if it's trusted.
- Documented. "Comprehensive documentation to navigate the complexity of the efficiency space" — Netflix treats documentation as a first-class deliverable because the underlying model (owners, cost heuristics, multi-tenancy) is inherently complex.
- SLAs published. "Well-defined Service Level Agreements (SLAs) to set expectations with downstream consumers during delays, outages or changes" — cost data is treated as a production data product, not a spreadsheet.
- Single-owner resolution + multi-tenant distribution. "For cost accounting purposes, we resolve assets to a single owner, or distribute costs when assets are multi-tenant."
- Multi-aggregation output. "We do also provide usage and cost at different aggregations for different consumers." — same substrate, multiple consumer-shaped views.
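The last two principles can be sketched in a few lines of Python. The post does not say how costs are distributed across tenants, so the usage-proportional split below is an assumption, as are the function names `distribute_cost` and `roll_up` and the team-to-org mapping. A single-owner asset is simply the one-tenant case of the same function.

```python
from collections import defaultdict


def distribute_cost(total_cost, usage_by_tenant):
    """Split an asset's cost across tenants in proportion to their usage.

    Assumed rule, for illustration only. With one tenant this reduces to
    single-owner resolution.
    """
    if not usage_by_tenant:
        raise ValueError("asset has no known tenants")
    total_usage = sum(usage_by_tenant.values())
    if total_usage == 0:
        # No recorded usage: fall back to an even split (also an assumption).
        even = total_cost / len(usage_by_tenant)
        return {tenant: even for tenant in usage_by_tenant}
    return {
        tenant: total_cost * used / total_usage
        for tenant, used in usage_by_tenant.items()
    }


def roll_up(cost_by_team, org_of_team):
    """Re-aggregate team-level cost to org level for a different consumer."""
    by_org = defaultdict(float)
    for team, cost in cost_by_team.items():
        by_org[org_of_team[team]] += cost
    return dict(by_org)


shared = distribute_cost(100.0, {"team-a": 3.0, "team-b": 1.0})
single = distribute_cost(100.0, {"team-a": 5.0})
by_org = roll_up(shared, {"team-a": "org-x", "team-b": "org-x"})
print(shared, single, by_org)
```

Because both views are derived from the same team-level numbers, the org-level total always reconciles with the sum of its teams, which is the property that makes a multi-aggregation output traceable.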
Three named program tensions
(from the post)
- "A Few Sizes to Fit the Majority" — every platform has per-platform customisation that doesn't fit one data-model mould. Netflix's answer is ongoing explicit negotiation with producers + consumers rather than a single rigid schema.
- "Data Guarantees" — audits + per-layer health visibility are load-bearing for trust; "maintaining data completeness while ensuring correctness becomes challenging due to upstream latency and required transformations."
- "Abstraction Layers" — when an internal platform team builds a SaaS on top of another internal platform, cost attribution has to chase the abstraction chain. FPD's clean inventory/ownership/usage separation is the insulator that lets CEA produce sensible numbers regardless of whether a given user builds on AWS directly or on a Netflix-internal SaaS layered above it.
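To make the "Abstraction Layers" tension concrete, here is a rough Python sketch of pushing cost down a chain of internal platforms until it reaches the teams at the bottom. The post does not describe an algorithm, so this is purely illustrative: the `consumers` mapping, the share values, and the node names are invented, and the sketch assumes the dependency graph has no cycles and that each platform's shares sum to 1.

```python
from collections import defaultdict


def push_down(cost_by_node, consumers):
    """Propagate cost through layered platforms to leaf consumers.

    cost_by_node: {node: cost} for the nodes where cost is first observed.
    consumers:    {platform: {consumer: share}}; a consumer that is itself
                  a key of `consumers` is another platform layered on top.
    Assumes an acyclic graph and shares that sum to 1 per platform.
    """
    final = defaultdict(float)
    pending = list(cost_by_node.items())
    while pending:
        node, cost = pending.pop()
        if node in consumers:
            # Intermediate platform: hand its cost on to whoever uses it.
            for consumer, share in consumers[node].items():
                pending.append((consumer, cost * share))
        else:
            # Leaf: a team that does not resell the capacity further.
            final[node] += cost
    return dict(final)


layers = {
    "internal-compute": {"internal-saas": 0.5, "team-a": 0.5},
    "internal-saas": {"team-b": 0.5, "team-c": 0.5},
}
leaf_costs = push_down({"internal-compute": 100.0}, layers)
print(leaf_costs)
```

In this toy example team-b and team-c never touch the compute platform directly, yet each ends up with a quarter of its cost through the SaaS layer they build on.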
Forward direction
"Longer term, we plan to extend FPD to other areas of the business such as security and availability." The post also states: "We aim to move towards proactive approaches via predictive analytics and ML for optimizing usage and detecting anomalies in cost."
In other words: (1) generalise FPD's substrate discipline beyond cost into other cross-platform metrics; (2) move CEA from descriptive to prescriptive — anomaly detection and optimisation recommendations rather than just dashboards.
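As a rough idea of what the simplest form of cost anomaly detection could look like, the sketch below flags days whose cost deviates strongly from a trailing window. This is a generic z-score baseline chosen for illustration, not the predictive analytics or ML approach the post alludes to; the function name, window size, and threshold are all assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(daily_cost, window=7, threshold=3.0):
    """Return indices of days whose cost is a z-score outlier.

    Each day is compared against the mean and sample standard deviation
    of the preceding `window` days. Days with a perfectly flat history
    (standard deviation of zero) are skipped rather than flagged.
    """
    flagged = []
    for i in range(window, len(daily_cost)):
        history = daily_cost[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(daily_cost[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


costs = [100, 102, 98, 101, 99, 100, 103, 250, 101]
spikes = flag_anomalies(costs)
print(spikes)
```

Note that the spike on day 7 inflates the window's variance for the days that follow, so a production approach would likely need something more robust (for example median-based statistics or seasonality-aware models).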
Related patterns
- patterns/chargeback-cost-attribution — FPD/CEA is the pre-chargeback substrate: the attributed-cost time-series that a Netflix chargeback mechanism would consume, rather than the chargeback mechanism itself.
- concepts/capacity-efficiency — the Meta framing of the same problem space. Netflix's post focuses on upstream data correctness and transparent attribution as the substrate for capacity-efficiency work; Meta's focuses on the offense/defense/AI-agent optimisation loop that sits above such a substrate.
Seen in
- sources/2025-01-02-netflix-cloud-efficiency-at-netflix — canonical disclosure; two-layer architecture (FPD → CEA) + data contracts + three named program tensions + future-work toward predictive anomaly detection.
Related
- companies/netflix
- concepts/data-contract
- patterns/chargeback-cost-attribution
- concepts/capacity-efficiency
- concepts/cost-tracking-per-team — sibling LLM-ops variant of per-team cost visibility.
- systems/apache-spark — named platform onboarded to FPD in the post's worked example.