Skip to content

CONCEPT Cited by 1 source

Forward and reverse edge index

A forward and reverse edge index is a graph-storage technique where every edge is written twice into the link namespace — once keyed by the source node (the forward index) and once keyed by the destination node (the reverse index). Either-direction traversal then becomes a single partition range read, with the cost paid as a doubling of the edge-link write rate.

Why two indexes

Adjacency-list storage is intrinsically directional: if the link namespace partitions by source-node-id, then "give me everyone connected to D" (a reverse traversal) requires an O(|E|) full scan unless the system maintains a second index keyed by target-node-id.

Forward link namespace                Reverse link namespace
  partition: source                     partition: target
  clustering: target                    clustering: source

  S₁ → [T₁, T₂, T₃, ...]                 T₁ → [S₁, S₂, ...]
  S₂ → [T₁, T₃, ...]                     T₂ → [S₁, ...]
                                         T₃ → [S₁, S₂, ...]

Either direction is now O(neighbours), not O(|E|).

Each new edge in a property graph causes two edge-link writes (one forward, one reverse). For a system handling write amplification across multiple namespaces this is acceptable when:

  • The link records are small (just the link identity + last-write timestamp — no property bag).
  • Edge-property writes are kept in a separate namespace and use a direction-agnostic identifier so they happen once, not twice.
  • Multi-namespace write atomicity is handled by an entropy-repair mechanism (Kafka retry pipeline in Netflix Graph Abstraction).

If the property bag had been stored in the link record itself, a forward + reverse index would double the property-storage cost and any property update would need to update two records atomically — both undesirable. Splitting links from properties (patterns/separate-edge-links-from-properties) is the prerequisite that makes forward+reverse indexes practical at scale.

Mechanism

A bidirectional edge (profile, linked_to, device) produces two records in the link namespace:

Index Partition Clustering Direction
Forward profile-A device-X OUT
Reverse device-X profile-A IN

Edge properties are written once to the property namespace under the lex-sorted-and- concatenated identifier.

Read paths enabled

Query Path
All devices for profile-A range read on forward index, partition profile-A
All profiles for device-X range read on reverse index, partition device-X
Properties of edge (profile-A, device-X) point read on property namespace at lex-sorted-concat ID

Seen in

  • sources/2026-05-29-netflix-high-throughput-graph-abstraction-at-netflix-part-icanonical wiki disclosure: "edge indexes are separated into forward and reverse indexes to support traversals in either direction." Each link is written to both indexes. Combined with edge-link/property split + direction-agnostic edge id, edge-property writes happen once and either direction is single-range-read.
Last updated · 542 distilled / 1,571 read