
Netflix Amber (media feature store)

Amber is Netflix's feature store for media — the backing service that other Netflix services query for ML features derived from video, image, and other media assets (Source: sources/2024-07-22-netflix-supporting-diverse-ml-systems-at-netflix).

Why on-demand, not precomputed

"While Amber is a feature store, precomputing and storing all media features in advance would be infeasible. Instead, we compute and cache features on an on-demand basis." The volume of possible media features, multiplied by the cardinality of Netflix's catalog, makes blanket precomputation uneconomical; Amber trades some per-request latency for storage efficiency. See concepts/on-demand-feature-compute.

Mechanism: async queue into Metaflow Hosting

                        ┌────────────────────────┐
  service ── request ──►│  Amber                 │
  feature F             │  (feature dependency   │
                        │   graph computation)   │
                        └──────────┬─────────────┘
                                   │ async request(s) for
                                   │ missing features
                                   ▼
                        ┌────────────────────────┐
                        │  Metaflow Hosting      │
                        │  queue                 │
                        └──────────┬─────────────┘
                                   │ triggers when
                                   │ compute is available
                                   ▼
                        ┌────────────────────────┐
                        │  Feature computation   │
                        │  flow (Metaflow)       │
                        └──────────┬─────────────┘
                                   │ response cached
                                   ▼
                        ┌────────────────────────┐
                        │  Metaflow Hosting      │
                        │  cache                 │
                        └──────────┬─────────────┘
                                   │ Amber fetches the cached
                                   ▼ response after compute lands
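The "feature dependency graph computation" step in the first box can be sketched as a post-order graph walk that yields the missing upstream features to request. The graph and feature names below are illustrative, not Amber's:

```python
# Hypothetical dependency graph: each feature lists the features it is
# derived from. All names here are made up for illustration.
DEPS: dict[str, list[str]] = {
    "shot_boundaries": [],
    "shot_embeddings": ["shot_boundaries"],
    "match_cut_candidates": ["shot_embeddings", "shot_boundaries"],
}

def missing_features(requested: str, cached: set[str]) -> list[str]:
    """Dependencies-first order of features still to be computed."""
    order: list[str] = []
    seen: set[str] = set()

    def visit(feature: str) -> None:
        if feature in seen or feature in cached:
            return                      # already scheduled or cached
        seen.add(feature)
        for dep in DEPS[feature]:       # schedule dependencies first
            visit(dep)
        order.append(feature)

    visit(requested)
    return order
```

Each feature in the returned list would then become one of the "async request(s) for missing features" in the diagram.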

From the post: "When a service requests a feature from Amber, it computes the feature dependency graph and then sends one or more asynchronous requests to Metaflow Hosting, which places the requests in a queue, eventually triggering feature computations when compute resources become available. Metaflow Hosting caches the response, so Amber can fetch it after a while. We could have built a dedicated microservice just for this use case, but thanks to the flexibility of Metaflow Hosting, we were able to ship the feature faster with no additional operational burden."

This is the canonical wiki instance of patterns/async-queue-feature-on-demand.
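The enqueue-then-poll shape of the pattern can be sketched with an in-process queue and cache; everything here is illustrative and is not the Metaflow Hosting API:

```python
import queue
import threading
import time

# Hypothetical sketch of the async-queue pattern: the requester enqueues
# work, a worker computes when capacity allows and writes the result to
# a cache, and the requester polls the cache until the result lands.
work_queue: "queue.Queue[tuple[str, str]]" = queue.Queue()
result_cache: dict = {}

def worker() -> None:
    while True:
        asset_id, feature = work_queue.get()   # blocks until work arrives
        result_cache[(asset_id, feature)] = f"{feature}({asset_id})"
        work_queue.task_done()

def request_feature(asset_id: str, feature: str, timeout: float = 2.0):
    """Enqueue a computation, then poll the cache until it appears."""
    work_queue.put((asset_id, feature))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if (asset_id, feature) in result_cache:
            return result_cache[(asset_id, feature)]
        time.sleep(0.01)
    return None                                # not ready; caller retries

threading.Thread(target=worker, daemon=True).start()
```

In the real system the queue, the computation, and the cache are all Metaflow Hosting's, which is why, per the quote, no dedicated microservice was needed.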

Scale / disclosure

The post gives no numbers on feature count, QPS, cache hit rate, or dependency-graph depth. A deeper Netflix post, Scaling Media Machine Learning at Netflix, is linked inline and may be the better primary source once ingested.
