SYSTEM Cited by 3 sources
Netflix Data Gateway Platform¶
Netflix Data Gateway is the platform layer underneath Netflix's Data Abstraction Layer (DAL) services, which sit between microservices and online datastores (Cassandra, EVCache, DynamoDB, RocksDB, etc.). It hosts those services and "enables us to support the broad spectrum of use cases that Netflix demands with minimal developer effort."
Within this wiki, the Data Gateway is a stub page: it is named as the parent platform in the 2024-09-19 KV DAL post (sources/2024-09-19-netflix-netflixs-key-value-data-abstraction-layer) but not yet deep-dived. The canonical explanation is Netflix's earlier post, Data Gateway: A Platform for Growing and Protecting the Data Tier, which has not yet been ingested into the wiki; expansion is expected when it is.
Role in the KV DAL story¶
The 2024-09-19 KV post names the Data Gateway only in passing:
"To overcome these challenges, we developed a holistic approach that builds upon our Data Gateway Platform. This approach led to the creation of several foundational abstraction services, the most mature of which is our Key-Value (KV) Data Abstraction Layer (DAL)."
Two points follow:
- KV DAL is not the only abstraction service on the Data Gateway. The post says "several foundational abstraction services": KV is the mature exemplar, but other DALs (e.g. time-series, graph) presumably run on the same platform.
- Data Gateway provides the shared infrastructure on which DAL services can enforce the cross-cutting properties KV relies on: namespace-routed storage, in-band signaling, consistency-scope / consistency-target config, policy-layer access control.
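As a rough illustration of namespace-routed storage with per-namespace consistency config, the following is a minimal sketch, not Netflix's API. `NamespaceConfig`, `GatewayRouter`, the namespace names, and the `consistency_scope` values are all hypothetical, and plain dictionaries stand in for real datastore clients.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespaceConfig:
    """Hypothetical per-namespace settings; field names are illustrative."""
    namespace: str
    backend: str            # e.g. "cassandra" or "evcache"
    consistency_scope: str  # illustrative values: "local" or "global"


class GatewayRouter:
    """Toy router: resolves a namespace to its config and an in-memory store."""

    def __init__(self, configs):
        self._configs = {c.namespace: c for c in configs}
        # One dict per namespace stands in for a real datastore client.
        self._stores = {c.namespace: {} for c in configs}

    def config_for(self, namespace):
        try:
            return self._configs[namespace]
        except KeyError:
            raise LookupError(f"unknown namespace: {namespace}") from None

    def put(self, namespace, key, value):
        self.config_for(namespace)  # reject writes to unconfigured namespaces
        self._stores[namespace][key] = value

    def get(self, namespace, key):
        self.config_for(namespace)
        return self._stores[namespace].get(key)


router = GatewayRouter([
    NamespaceConfig("profiles", backend="cassandra", consistency_scope="local"),
    NamespaceConfig("sessions", backend="evcache", consistency_scope="local"),
])
router.put("profiles", "user-1", {"name": "example"})
```

The point of the sketch is that callers address a namespace, never a datastore; which backend and consistency setting apply is a property of the namespace's configuration.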
Mature abstractions named¶
As of the 2024-11-13 Distributed Counter post, the wiki can name three mature abstractions hosted on the Data Gateway, with direct composition between them:
- KV DAL — gRPC point-lookup service with a two-level-map data model over Cassandra + EVCache + DynamoDB + RocksDB.
- TimeSeries Abstraction — event store for temporal event data, backed by Cassandra with bucketed partitioning.
- Distributed Counter Abstraction — counting service built on top of TimeSeries, which it uses as its event store, with EVCache as the rollup cache. This composition, one DAL consuming another, is concrete evidence of the platform's "compose multiple abstraction layers using containers deployed on the same host, with each container fetching configuration specific to its scope" property.
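The counter-over-event-store composition described in the list above can be sketched as follows. This is a toy under stated assumptions: `ToyTimeSeries` and `ToyCounter` are hypothetical names, a list stands in for the Cassandra-backed event store, a dictionary stands in for the EVCache rollup cache, and rollups are triggered by hand rather than by any background process.

```python
from collections import defaultdict


class ToyTimeSeries:
    """Stand-in event store: appends (timestamp, delta) events per series."""

    def __init__(self):
        self._events = defaultdict(list)

    def append(self, series_id, timestamp, delta):
        self._events[series_id].append((timestamp, delta))

    def read_from(self, series_id, offset):
        # Return every event at or after the given position in the series.
        return self._events[series_id][offset:]


class ToyCounter:
    """Counter composed over the event store, with a dict as rollup cache."""

    def __init__(self, event_store):
        self._events = event_store
        self._rollups = {}  # counter name -> (events consumed, running count)

    def add(self, counter, timestamp, delta=1):
        # A write is only an event append; no count is updated in place.
        self._events.append(counter, timestamp, delta)

    def rollup(self, counter):
        # Fold events not yet consumed into the cached running count.
        offset, count = self._rollups.get(counter, (0, 0))
        new_events = self._events.read_from(counter, offset)
        count += sum(delta for _, delta in new_events)
        self._rollups[counter] = (offset + len(new_events), count)
        return count

    def get(self, counter):
        # Reads serve the cached rollup, which may lag behind recent writes.
        return self._rollups.get(counter, (0, 0))[1]
```

In this shape a read between rollups returns the last rolled-up value, so the count is eventually consistent with the event log rather than immediately consistent.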
The 2026-05-29 Graph Abstraction Part-I post (sources/2026-05-29-netflix-high-throughput-graph-abstraction-at-netflix-part-i) adds a fourth abstraction:
- Graph Abstraction — strongly-typed property-graph service composed over multiple KV namespaces (per-node-type + per-edge-type forward link + reverse link + edge property), optionally over TimeSeries Abstraction (historical view), with EVCache as the read-aside property cache. The most complex multi-DAL composition shape on the platform to date.
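To make the per-node-type, forward-link, reverse-link, and edge-property layout more concrete, here is a minimal sketch of one edge write fanning out to several key-value namespaces. The namespace naming scheme (`node.<type>`, `edge.<type>.forward`, and so on) and the class `ToyGraphOverKV` are assumptions made for illustration; the source does not describe the actual naming or write path.

```python
from collections import defaultdict


def _two_level_map():
    # Mirrors a two-level-map shape: record id -> item key -> value.
    return defaultdict(dict)


class ToyGraphOverKV:
    """Toy property graph stored as several in-memory KV namespaces."""

    def __init__(self):
        # namespace name -> two-level map; names below are hypothetical.
        self._namespaces = defaultdict(_two_level_map)

    def put_node(self, node_type, node_id, properties):
        self._namespaces[f"node.{node_type}"][node_id].update(properties)

    def put_edge(self, edge_type, from_id, to_id, properties=None):
        # One logical edge becomes a forward link, a reverse link,
        # and a separate edge-property record.
        self._namespaces[f"edge.{edge_type}.forward"][from_id][to_id] = True
        self._namespaces[f"edge.{edge_type}.reverse"][to_id][from_id] = True
        self._namespaces[f"edge.{edge_type}.props"][(from_id, to_id)] = dict(properties or {})

    def out_neighbors(self, edge_type, node_id):
        return sorted(self._namespaces[f"edge.{edge_type}.forward"][node_id])

    def in_neighbors(self, edge_type, node_id):
        return sorted(self._namespaces[f"edge.{edge_type}.reverse"][node_id])

    def edge_properties(self, edge_type, from_id, to_id):
        return self._namespaces[f"edge.{edge_type}.props"].get((from_id, to_id), {})
```

Storing the reverse link alongside the forward link trades extra writes for the ability to traverse an edge in either direction with a single keyed lookup.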
Data Gateway as graph-schema authority¶
The 2026-05-29 post shows a further role for the Data Gateway: it is the schema authority for Graph Abstraction. Each graph namespace is associated with an explicit graph schema configured in the Control Plane: edge mappings (fromNodeType, edgeType, toNodeType, directionType) extended with property schemas (propertyKey, propertyValueType). Graph Abstraction servers poll the Control Plane periodically for schema updates and build an in-memory metadata graph from that schema (patterns/schema-aware-traversal-planning).
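A minimal sketch of how edge mappings might be turned into an in-memory metadata graph and used to check a traversal before it runs. The field names follow the schema fields quoted above, but the classes, the `"directed"` and `"bidirectional"` direction values, and the example node and edge types are assumptions; periodic polling of the Control Plane is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeMapping:
    from_node_type: str
    edge_type: str
    to_node_type: str
    direction_type: str    # assumed values: "directed" or "bidirectional"
    properties: tuple = () # (propertyKey, propertyValueType) pairs


class MetadataGraph:
    """Index of which node types each (node type, edge type) pair can reach."""

    def __init__(self, mappings):
        self._targets = defaultdict(set)
        for m in mappings:
            self._targets[(m.from_node_type, m.edge_type)].add(m.to_node_type)
            if m.direction_type == "bidirectional":
                self._targets[(m.to_node_type, m.edge_type)].add(m.from_node_type)

    def targets(self, node_type, edge_type):
        return set(self._targets.get((node_type, edge_type), ()))

    def validate_path(self, start_type, edge_types):
        """Return the node types a traversal can end on, or raise if a hop
        is not permitted by the schema."""
        frontier = {start_type}
        for edge_type in edge_types:
            reachable = set()
            for node_type in frontier:
                reachable |= self.targets(node_type, edge_type)
            if not reachable:
                raise ValueError(f"no '{edge_type}' edge from {sorted(frontier)}")
            frontier = reachable
        return frontier


schema = [
    EdgeMapping("member", "watched", "title", "directed"),
    EdgeMapping("title", "has_genre", "genre", "directed", (("weight", "double"),)),
]
metadata = MetadataGraph(schema)
end_types = metadata.validate_path("member", ["watched", "has_genre"])
```

Because the check only touches the in-memory index, an invalid traversal can be rejected without issuing any storage reads.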
Capacity-modelled provisioning¶
The same post discloses the provisioning automation Data Gateway uses for namespace deployment. Verbatim: "the optimal, most cost-effective hardware configuration is determined by our provisioning automation, based on user-provided requirements such as throughput, latency, dataset size, and workload criticality." Customers describe workload requirements, not a hardware spec — the Control Plane resolves throughput / latency / dataset-size / criticality into the right physical storage layer (dedicated or shared), with the actual algorithm open-sourced at Netflix-Skunkworks/service-capacity-modeling and described in Joey Lynch's AWS re:Invent talk.
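The requirements-in, placement-out idea can be illustrated with a deliberately simple sketch. It does not use or reproduce the service-capacity-modeling library: `WorkloadRequirements`, `choose_storage_layer`, the criticality scale, and every threshold below are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadRequirements:
    """User-provided requirements; field names and units are hypothetical."""
    reads_per_second: int
    writes_per_second: int
    p99_latency_ms: float
    dataset_size_gib: int
    criticality_tier: int  # assumed scale: 0 is the most critical


def choose_storage_layer(req, *, max_shared_rps=10_000, max_shared_gib=100,
                         min_shared_latency_ms=5.0):
    """Toy rule set: return "dedicated" or "shared" for a namespace.

    The thresholds are placeholders, not values from the source.
    """
    total_rps = req.reads_per_second + req.writes_per_second
    if req.criticality_tier == 0:
        return "dedicated"  # most critical workloads are isolated
    if total_rps > max_shared_rps or req.dataset_size_gib > max_shared_gib:
        return "dedicated"  # too large for a shared cluster
    if req.p99_latency_ms < min_shared_latency_ms:
        return "dedicated"  # latency target too tight to share
    return "shared"


small = WorkloadRequirements(500, 100, 20.0, 10, criticality_tier=2)
large = WorkloadRequirements(50_000, 5_000, 20.0, 2_000, criticality_tier=2)
```

The caller never names an instance type or cluster; it states what the workload needs and the rule set decides where the namespace lands.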
Seen in¶
- sources/2024-09-19-netflix-netflixs-key-value-data-abstraction-layer — named as the platform on which KV DAL is the most mature abstraction service.
- sources/2024-11-13-netflix-netflixs-distributed-counter-abstraction — names the Distributed Counter Abstraction as the third mature abstraction, built on top of the TimeSeries Abstraction, all deployed via the same Data Gateway control plane.
- sources/2026-05-29-netflix-high-throughput-graph-abstraction-at-netflix-part-i — names Graph Abstraction as the fourth mature abstraction and the most complex multi-DAL composition on the platform; establishes the Control Plane as the graph-schema authority behind the in-memory metadata graph, and describes capacity-modelled provisioning.
Related¶
- systems/netflix-kv-dal — mature DAL service on this platform.
- systems/netflix-timeseries-abstraction — sibling DAL on the same platform, event store for temporal data.
- systems/netflix-distributed-counter — third mature DAL; composed on top of TimeSeries.
- systems/netflix-graph-abstraction — fourth mature DAL; composes KV + TimeSeries + EVCache.
- patterns/data-abstraction-layer — the architectural shape Data Gateway hosts.
- patterns/schema-aware-traversal-planning — Graph Abstraction's schema-driven query-time optimisations rely on the graph schema held in this platform's Control Plane.
- concepts/in-memory-schema-metadata-graph
- concepts/database-agnostic-abstraction
- companies/netflix