Netflix Lightbulb

Lightbulb is Netflix's 2026 routing-metadata resolver for the centralized Model Serving Platform. Unlike Switchboard — which sat in the critical path of every request — Lightbulb is out of the payload path: it consumes minimal request context, resolves it to routing metadata, and returns. The actual request routing to the serving cluster VIP is then performed by Envoy at the data plane, on the basis of a routingKey header Lightbulb provides.

Lightbulb preserves the properties that made Switchboard valuable — single client integration point, Objective-addressable routing, A/B-test-aware model selection, shadow mode + canary — while fixing Switchboard's three scaling pains: single point of failure, serialization tax, and tenant-isolation gaps.

Output contract

For each request, Lightbulb produces two pieces of information:

  • routingKey — placed in HTTP headers. Envoy maps this key → target cluster VIP via its routing-rules config. Small payload, fast parse, no serialization cost for the request body.
  • ObjectiveConfig — placed in the request body. Contains the selected model ID plus request-specific configuration required for model execution (A/B cell context, input type, model metadata). Consumed by the serving host, not by Envoy.
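The two-part contract can be sketched as a small data shape. The class, helper, and the `x-routing-key` header name are illustrative assumptions, not names from the post:

```python
from dataclasses import dataclass

@dataclass
class LightbulbResponse:
    # Small header value consumed by Envoy to pick the target cluster VIP.
    routing_key: str
    # Model-execution parameters consumed by the serving host, not by Envoy.
    objective_config: dict

def apply_to_request(resp: LightbulbResponse, headers: dict, body: dict) -> None:
    # Hypothetical client helper: the routingKey goes into the headers for
    # Envoy, while the ObjectiveConfig is merged into the request body so the
    # headers stay small.
    headers["x-routing-key"] = resp.routing_key  # header name is an assumption
    body["objectiveConfig"] = resp.objective_config

headers, body = {}, {"inputs": [1.0, 2.0]}
apply_to_request(
    LightbulbResponse("recsys-v2", {"modelId": "m-123", "abCell": "B"}),
    headers,
    body,
)
```

Note that the existing body content (the model inputs) is untouched; the client only appends the config alongside it.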

Quoting the post:

"While the routingKey is added to the headers for Envoy proxy to consume, the client adds the ObjectiveConfig parameters to the request itself. This is done to avoid bloating the request headers while passing additional parameters for the model to process the request appropriately."

This headers-for-routing / body-for-model-inputs split is the load-bearing design choice that removes Lightbulb from the payload path. See patterns/separate-routing-from-model-selection.

What Lightbulb does NOT do

  • It is not in the request path for the payload. The client calls Lightbulb (small context in, headers+config out), then sends the real request through Envoy with the headers applied. Lightbulb never serializes or deserializes the payload, removing the serialization tax that Switchboard incurred.
  • It does not own connection routing. Envoy owns that, based on the routingKey → VIP mapping published from a control plane, rather than through a centralized proxy as in the Switchboard design.
  • It is not the whole story. Envoy routes the connection; Lightbulb provides the metadata. Both are required.
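The two-step flow above can be sketched with stubbed components. `resolve` and `envoy_send` are hypothetical names; the point is only that the payload bytes never pass through the resolver:

```python
def resolve(context: dict) -> tuple[str, dict]:
    """Stub Lightbulb: minimal request context in, (routingKey, ObjectiveConfig)
    out. It never sees the request payload."""
    objective = context["objective"]
    return f"vip-key-{objective}", {
        "modelId": f"{objective}-model",
        "abCell": context.get("abCell", "default"),
    }

def envoy_send(headers: dict, payload: bytes, routing_rules: dict) -> str:
    """Stub Envoy data plane: routes the full payload by header alone,
    returning the target cluster VIP."""
    return routing_rules[headers["x-routing-key"]]

# Step 1: resolve routing metadata from small context (no payload involved).
routing_key, objective_config = resolve({"objective": "ranker", "abCell": "B"})

# Step 2: send the real request through Envoy with the header applied.
vip = envoy_send(
    {"x-routing-key": routing_key},
    payload=b"large model inputs...",  # untouched by Lightbulb
    routing_rules={"vip-key-ranker": "ranker-cluster-vip"},
)
```

This mirrors the division of labor in the section above: Lightbulb provides the metadata, Envoy routes the connection.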

Rules flow

Lightbulb consumes the same Rules surface researchers authored for Switchboard (the JavaScript → JSON config published via Gutenberg), but splits the rules into two consumer contracts:

  • Model Serving Configuration — which model to run at request time, plus required metadata (the ObjectiveConfig Lightbulb returns).
  • Routing Rules — given the selected model, which VIP should Envoy route to (the routingKey → VIP mapping published to Envoy's control plane).

This dual-consumer design preserves the pub/sub-separated config pattern — independent release cycle from code deploys — while letting Envoy handle connection routing natively.
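A hedged sketch of the dual-consumer split, assuming a toy rules shape (the post does not show the real Switchboard Rules format):

```python
# Toy published rules: objective -> selected model, its metadata, and the
# cluster VIP where that model is served. Field names are assumptions.
rules = {
    "play-ranker": {
        "modelId": "ranker-v7",
        "abCell": "B",
        "vip": "ranker-cluster-vip",
    },
}

def model_serving_configuration(rules: dict, objective: str) -> dict:
    # Consumer 1 (Lightbulb): which model to run at request time, plus the
    # request metadata it returns as ObjectiveConfig.
    r = rules[objective]
    return {"modelId": r["modelId"], "abCell": r["abCell"]}

def routing_rules(rules: dict) -> dict:
    # Consumer 2 (Envoy control plane): the routingKey -> VIP mapping.
    return {objective: r["vip"] for objective, r in rules.items()}
```

Both views derive from one authored source, so researchers still publish a single rule set with a release cycle independent of code deploys.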

Why this shape

From the post:

"Envoy is already used for all egress communication between apps at Netflix, and it can route requests to different clusters (VIPs) based on the configurable Routing Rules published from our control plane. However, it lacks the information needed to make routing decisions and the ability to enrich the request body with additional serving parameters required for A/B testing model variants. We introduced Lightbulb to cover this gap."

Summary:

  • Envoy already does connection routing on cluster-VIP topology.
  • Envoy does not do Objective-resolution, A/B-cell lookup, or request-body enrichment — the research-facing decisions.
  • Lightbulb does exactly those decisions, and hands the result to Envoy via headers.
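On the Envoy side, header-based cluster selection is ordinary route configuration. A minimal illustrative fragment, assuming the `x-routing-key` header name and cluster names (none of which appear in the post):

```yaml
# Illustrative Envoy route config: match on the routingKey header and
# forward to the corresponding serving cluster VIP.
route_config:
  virtual_hosts:
    - name: model_serving
      domains: ["*"]
      routes:
        - match:
            prefix: "/"
            headers:
              - name: x-routing-key        # header name is an assumption
                string_match:
                  exact: ranker-routing-key
          route:
            cluster: ranker-cluster-vip
```

In the real system this mapping would be published from the control plane described above rather than hand-written.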

Three named improvements over Switchboard

The post lists three design choices where Lightbulb explicitly departs from Switchboard:

  1. Remove the routing service from the direct request path. Addresses serialization tax + single point of failure.
  2. Separate model inputs from request metadata. Large payloads don't have to be re-serialized by a routing service; they flow through Envoy untouched while small headers carry the routing metadata.
  3. Provide better isolation for the routing layer. Addresses tenant-isolation: Lightbulb can be sharded per use-case tenant; Envoy provides natural cluster-level isolation between serving shards.
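Per-tenant sharding (point 3) can be sketched as a deterministic mapping from use case to a dedicated Lightbulb shard. The shard names and hashing scheme here are assumptions for illustration, not the post's mechanism:

```python
import hashlib

# Hypothetical Lightbulb shard pool; each shard serves a subset of tenants.
SHARDS = ["lightbulb-shard-a", "lightbulb-shard-b", "lightbulb-shard-c"]

def shard_for_tenant(use_case: str) -> str:
    # Stable hash so a noisy tenant's load lands on one shard and leaves
    # other tenants' resolvers unaffected.
    digest = hashlib.sha256(use_case.encode()).digest()
    return SHARDS[digest[0] % len(SHARDS)]

shard = shard_for_tenant("play-ranker")
```

Envoy then adds cluster-level isolation on top: each serving shard is a separate cluster VIP, so routing blast radius stays per tenant as well.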

Migration posture

The post does not quantify how much of the 1M req/sec Switchboard traffic has migrated to the Lightbulb + Envoy shape, or give before/after latency numbers for the new architecture. It is presented as a design that "retain[s] the advantages of Switchboard" while fixing the scaling pains.
