CONCEPT Cited by 2 sources
Graph traversal fanout¶
Graph traversal fanout is the number of edges a traversal expands across at each hop. For a 1-hop traversal from a source node, fanout is the source's degree. For an n-hop traversal, fanout multiplies at each step — a 2-hop traversal from a node with degree 100 against neighbours of average degree 50 generates ~5000 candidate paths.
For OLTP graph workloads, fanout is the structural predictor of latency. Sub-second traversal budgets only hold if either:
- The fanout at each depth is bounded by the schema or by an operational invariant.
- The system parallelises the per-edge work efficiently across the fanout.
- The fanout is reduced by filters / type pruning / property pushdown before reads issue.
Median vs max fan-out¶
The same graph can have very different median and max fanouts at the same depth — a few hub nodes with extreme degree dominate the tail. Operational dashboards therefore track both:
- Median edge fanout at depth n — predicts the typical p50 / p90 traversal latency.
- Max edge fanout at depth n — predicts the p99 / p99.9 traversal latency.
Netflix Graph Abstraction tracks both as first-class operational metrics. Verbatim from Part-I:
"We track the average and max edge fanout at different depths to give us insights into the traversal performance for different graph datasets."
Operational consequences¶
| Property | Why it matters |
|---|---|
| Sub-second traversal latency | Fanout × per-edge cost must fit the budget; high fanout requires aggressive parallelism + caching |
| Memory pressure | Latest-sort traversals fetch then sort in memory; max fanout drives peak memory per request |
| Hub-node hot spots | The same hub appears in many traversals; cache locality on hub neighbourhoods is high-value |
| Tail-fanout outliers | One hub with millions of edges can blow the p99 latency budget on its own |
Netflix names a per-source-node edge cap as an operational invariant — "to ensure optimal performance without exerting too much memory pressure, we aim to limit the number of edges per source node within the system" — explicitly to keep in-memory latest-sort tractable for high-fanout sources. The specific cap is not disclosed.
The 2-hop number for RDG¶
Verbatim disclosure of how fanout drives published latency:
"Currently, the RDG is powered by 2-hop traversals with a higher degree of fan-out. While these operations can reach upwards of 100 ms in latency, the 90th percentile (p90) latency remains under 50ms."
The fanout-vs-latency frontier is therefore not theoretical — it's literally the shape of the published p90/p99 numbers for Netflix's most fanout-heavy traversal workload.
How systems control fanout¶
| Mechanism | Where it cuts |
|---|---|
| Schema-aware path elimination (concepts/in-memory-schema-metadata-graph) | Drop paths the schema doesn't permit before reads |
| Type filters in the traversal API | Drop neighbours of wrong type at the storage layer |
| Edge-limit per traversal step | Bound fanout explicitly in the API (Netflix setEdgeLimit(N)) |
| Property-key pushdown | Reduce per-edge data, not edge count, but lowers the per-fanout cost |
| Bounded per-source-node degree | Operational invariant; keeps worst-case fanout finite |
| Parallelism | Doesn't reduce fanout; reduces wall-clock latency at cost of concurrency |
Seen in¶
- sources/2026-05-29-netflix-high-throughput-graph-abstraction-at-netflix-part-i — canonical wiki disclosure; named as a first-class operational metric and the structural predictor of the RDG 2-hop p90<50ms / sometimes-100ms latency profile.
- sources/2026-05-19-airbnb-scaling-identity-graph-unified-knowledge-graph-infrastructure — Airbnb's identity graph (7B nodes, 11B edges) uses 4–8 hop traversals for Trust & Safety. High-fanout nodes cause disproportionate P95/P99 growth. Mitigated via parallel multi-slice fetching and client-side query rewriting to avoid non-batched backend operations on fan-out-heavy paths.