CONCEPT Cited by 1 source
Identity graph
An identity graph captures relationships between users and their associated entities (devices, emails, phone numbers, payment methods, IP addresses) in a graph structure. It serves aggregated insights that enable two things: identity resolution (determining that multiple accounts belong to the same person) and relationship understanding (mapping how users connect to each other through shared attributes).
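Identity resolution over such a graph can be sketched as a connected-components problem. The example below is a minimal illustration, not a description of any production system: it makes the naive assumption that any shared attribute links two accounts to the same identity, and the function name `resolve_identities` and the `"device:d1"`-style attribute labels are hypothetical.

```python
from collections import defaultdict


def resolve_identities(links):
    """Group accounts connected, directly or transitively, by shared attributes.

    links: iterable of (account_id, attribute) pairs, e.g. ("u1", "device:d1").
    Returns a sorted list of sorted account-id lists, one per resolved identity.
    Assumption: every shared attribute is treated as equally strong evidence.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    accounts = set()
    for account, attribute in links:
        accounts.add(account)
        # Tag node kinds so an account id can never collide with an attribute.
        union(("account", account), ("attr", attribute))

    clusters = defaultdict(list)
    for account in accounts:
        clusters[find(("account", account))].append(account)
    return sorted(sorted(members) for members in clusters.values())


links = [
    ("u1", "device:d1"),
    ("u2", "device:d1"),        # u1 and u2 share a device
    ("u2", "email:a@example.com"),
    ("u3", "email:a@example.com"),  # u2 and u3 share an email
    ("u4", "phone:555-0100"),   # u4 shares nothing
]
print(resolve_identities(links))  # [['u1', 'u2', 'u3'], ['u4']]
```

In practice a shared IP address is much weaker evidence than a shared payment method, so a real resolver would likely weight or filter edges rather than merge on every link.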
Primary use cases
- Fraud detection — detecting suspicious activity patterns by traversing shared-attribute edges between accounts.
- Linked account identification — finding accounts that share devices, payment methods, or contact information.
- Trust & Safety — risk scoring based on graph neighborhood properties (how many known-bad nodes are N hops away).
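The neighborhood-based risk signal mentioned above (how many known-bad nodes are within N hops) can be sketched with a breadth-first search. This is an illustrative sketch under assumptions: the graph is an in-memory undirected adjacency dict, and `known_bad_within` is a hypothetical helper name, not an API from any graph database.

```python
from collections import deque


def known_bad_within(graph, start, max_hops, known_bad):
    """Count known-bad nodes reachable from `start` in at most `max_hops` edges.

    graph: dict mapping node -> iterable of neighbor nodes (undirected edges
           are stored in both directions).
    known_bad: set of node ids already flagged as bad.
    The start node itself is not counted.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    bad_found = set()
    while frontier:
        node, depth = frontier.popleft()
        if node != start and node in known_bad:
            bad_found.add(node)
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return len(bad_found)


# u1 -- device d1 -- u2 -- email e1 -- u3 (known bad)
graph = {
    "u1": ["d1"],
    "d1": ["u1", "u2"],
    "u2": ["d1", "e1"],
    "e1": ["u2", "u3"],
    "u3": ["e1"],
}
print(known_bad_within(graph, "u1", 4, {"u3"}))  # 1
print(known_bad_within(graph, "u1", 3, {"u3"}))  # 0
```

When accounts connect only through attribute nodes, as in this toy graph, each account-to-account link costs two hops, so an indirect relationship through one intermediate account already needs four.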
Design challenges
- Scale: Identity graphs grow fast because every user action can create new edges. Airbnb's graph holds 7B nodes and 11B edges and gains about 5M new edges per day.
- Query complexity: Trust & Safety queries require multi-hop traversal (4–8 hops at Airbnb) to find indirect relationships.
- Long-tail latency: Graph density varies widely, so a traversal that hits a high-fanout (hub) node expands far more edges than a typical query and inflates P95/P99 latency disproportionately.
- Stability: Slow queries on dense subgraphs consume shared resources, which degrades other tenants' queries.
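The effect of hub nodes on traversal cost can be made concrete with a small sketch. The code below counts how many edges a breadth-first traversal expands, with an optional fanout cap that refuses to expand very dense nodes. The cap is a generic mitigation assumed here for illustration; the source does not say how Airbnb handles hubs, and `bounded_traverse` and `max_fanout` are hypothetical names.

```python
from collections import deque


def bounded_traverse(graph, start, max_hops, max_fanout=None):
    """BFS that reports the work it did and can skip high-fanout nodes.

    Returns (visited, edges_expanded, skipped_hubs):
      visited        - set of nodes reached
      edges_expanded - number of edges the traversal had to read
      skipped_hubs   - nodes whose degree exceeded max_fanout (not expanded)
    """
    seen = {start}
    frontier = deque([(start, 0)])
    edges_expanded = 0
    skipped_hubs = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        neighbors = graph.get(node, ())
        if max_fanout is not None and len(neighbors) > max_fanout:
            skipped_hubs.append(node)  # hub is reached but not expanded
            continue
        for neighbor in neighbors:
            edges_expanded += 1
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen, edges_expanded, skipped_hubs


# u1 touches one ordinary device and one IP address shared by 1000 accounts.
hub_graph = {
    "u1": ["ip1", "d1"],
    "d1": ["u1", "u2"],
    "u2": ["d1"],
    "ip1": ["u1"] + [f"x{i}" for i in range(1000)],
}
for i in range(1000):
    hub_graph[f"x{i}"] = ["ip1"]

_, uncapped, _ = bounded_traverse(hub_graph, "u1", max_hops=2)
_, capped, skipped = bounded_traverse(hub_graph, "u1", max_hops=2, max_fanout=100)
print(uncapped, capped, skipped)  # 1005 4 ['ip1']
```

A single hub turns a four-edge query into a thousand-edge one at only two hops, which is the shape of the P95/P99 problem described above. Capping fanout trades recall for predictable latency, since relationships that pass through the hub are no longer found.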
Seen in
- sources/2026-05-19-airbnb-scaling-identity-graph-unified-knowledge-graph-infrastructure — Airbnb's identity graph is one of their largest graph data products: 7B nodes, 11B edges, ~5M new edges/day. Serves Trust & Safety use cases with 4–8 hop queries. Migrated from third-party SaaS to an internal JanusGraph + DynamoDB platform for 10× write scalability and significant P99 latency reduction.
Related
- concepts/knowledge-graph — identity graph is a specialization
- concepts/graph-traversal-fanout — the latency challenge
- systems/airbnb-knowledge-graph-infrastructure — Airbnb's platform