CONCEPT Cited by 1 source

Determinantal Point Process (DPP)

Definition

A Determinantal Point Process (DPP) is a probability distribution over subsets of a ground set that assigns higher probability to subsets whose items are both high-quality and mutually diverse. For a ground set of n items, a DPP is parameterized by a positive semidefinite (PSD) kernel matrix L ∈ ℝ^{n×n}:

  • Diagonal L_{ii} — relevance or quality of item i.
  • Off-diagonal L_{ij} = L_{ji} — similarity between items i and j.

Probability of selecting subset Y: P(Y) ∝ det(L_Y), where L_Y is the submatrix of L restricted to the rows and columns indexed by Y. The determinant formulation naturally penalises redundancy — det(L_Y) is the squared volume spanned by the items' feature vectors, so similar (near-parallel) items span little volume and are jointly down-weighted.
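A minimal numeric sketch of the P(Y) ∝ det(L_Y) rule, using a hand-built 3-item kernel (items 0 and 1 are near-duplicates; the helper name `unnormalized_prob` is illustrative):

```python
import numpy as np

# Toy DPP kernel over 3 items: quality on the diagonal,
# similarity off-diagonal (items 0 and 1 are near-duplicates).
L = np.array([
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
])

def unnormalized_prob(L, Y):
    """P(Y) ∝ det(L_Y): determinant of the submatrix indexed by Y."""
    return np.linalg.det(L[np.ix_(Y, Y)])

print(unnormalized_prob(L, [0, 1]))  # redundant pair: 1 - 0.9² = 0.19
print(unnormalized_prob(L, [0, 2]))  # diverse pair:   1 - 0.1² = 0.99
```

The redundant pair {0, 1} gets roughly a fifth of the unnormalized probability of the diverse pair {0, 2}, even though all three items have identical quality.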

In production recommender systems, DPP is used for slate selection with diversity (Kulesza & Taskar 2012, arXiv:1207.6083; Wilhelm et al., Practical Diversified Recommendations on YouTube with DPPs, CIKM 2018).

Factored kernel form at production scale

Industry practice decomposes the kernel to keep it tractable:

L = f(Λ) · g(S) · f(Λ)

where Λ is the diagonal matrix of relevance scores (diagonal, so the transpose is redundant and the kernel stays symmetric), f(·) is a monotonically increasing element-wise transformation, and g(S) is the similarity-matrix contribution built from learned embeddings (e.g. GraphSage) or a categorical taxonomy (Source: sources/2026-04-07-pinterest-evolution-of-multi-objective-optimization-at-pinterest-home).
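A sketch of the factored kernel under stated assumptions: f(·) taken as exponentiation with a trade-off weight `alpha`, and g(S) as cosine similarity of stand-in random embeddings (real systems would plug in GraphSage embeddings and a learned f):

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.uniform(size=5)            # relevance scores: the diagonal of Λ
emb = rng.normal(size=(5, 16))          # stand-in item embeddings

alpha = 2.0                             # hypothetical diversity/relevance trade-off
f = np.exp(alpha * scores)              # monotone element-wise transform f(Λ)

emb /= np.linalg.norm(emb, axis=1, keepdims=True)
S = emb @ emb.T                         # cosine similarity g(S), PSD by construction

L = np.diag(f) @ S @ np.diag(f)         # L = f(Λ) · g(S) · f(Λ)
```

Because S is a Gram matrix and f(Λ) is diagonal positive, L is symmetric PSD by construction — the property the "kernel PSD enforcement" caveat below is about.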

Greedy MAP inference

Exact DPP sampling is expensive; production systems use greedy MAP inference — iteratively pick the item i that maximally increases log det(L_{Y ∪ {i}}) — backed by Cholesky-style updates for incremental determinant computation. Complexity trade-offs (from the SSD paper) motivate alternatives like SSD in certain regimes.
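A sketch of greedy MAP with incremental Cholesky-style updates (the standard fast-greedy scheme): each candidate's marginal gain in log det is tracked as a squared Cholesky pivot `d2`, updated in O(n·k) per step instead of recomputing determinants:

```python
import numpy as np

def greedy_map(L, k):
    """Greedy MAP for a DPP: repeatedly add the item with the largest
    marginal gain in log det(L_Y), via incremental Cholesky updates."""
    n = L.shape[0]
    selected = []
    c = np.zeros((n, k))                 # Cholesky rows of candidates vs. selected set
    d2 = np.diag(L).astype(float).copy() # current marginal det ratio per candidate
    for t in range(k):
        j = int(np.argmax(d2))
        if d2[j] <= 1e-12:               # remaining items add (numerically) no volume
            break
        selected.append(j)
        e = np.sqrt(d2[j])
        # New Cholesky column: c_i[t] = (L_ij - <c_i, c_j>) / e
        c[:, t] = (L[:, j] - c[:, :t] @ c[j, :t]) / e
        d2 -= c[:, t] ** 2               # Schur-complement diagonal update
        d2[j] = -np.inf                  # never re-select the same item
    return selected
```

On the toy 3-item kernel above (items 0 and 1 near-duplicates), `greedy_map` picks item 0 first and then skips its duplicate in favour of the dissimilar item 2.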

Use at Pinterest — Home Feed V1 (2021-2024)

Pinterest used DPP as "the main component" of its V1 Home Feed diversification layer (2021-2024) — a node in the backend blending chain that performed the time-intensive diversity decomposition over candidates.

Load-bearing Pinterest result: removing DPP dropped the user time-spent impression metric by over 2% after the first week — the canonical datum for "feed-level diversity is a long-term engagement lever" (Source: sources/2026-04-07-pinterest-evolution-of-multi-objective-optimization-at-pinterest-home).

Operational caveats (from Pinterest's DPP → SSD migration)

Pinterest moved away from DPP in early 2025 because SSD offered:

  • Lower greedy-inference complexity — avoids Cholesky-style decompositions over the similarity matrix.
  • Better numerical stability — DPP requires "positive semi-definite enforcement, log-determinants, and fragile numerical issues common in DPP (e.g., jittered kernels, Cholesky failures)."
  • Cleaner PyTorch implementation — SSD is linear-algebra blocks; DPP is harder to express as a tensor-native kernel.
  • Lower serving latency — enabling richer pairwise-similarity signals to be used in the kernel.

See patterns/ssd-over-dpp-diversification.

Caveats

  • Kernel PSD enforcement — the similarity component g(S) must be PSD, achieved via PSD-by-construction kernels (RBF, polynomial) or spectral clipping. Violations tend to fail silently or via fragile numerical errors in production.
  • Not intrinsically position-adaptive — DPP optimises over the slate globally; approaches like SSD handle top-down rendering more naturally.
  • Computationally heavy at large n — motivates candidate pre-pruning in practice.
  • MAP vs sampling — greedy MAP is common in production but loses DPP's probabilistic guarantees; exact sampling algorithms exist but are rarely used live.

Seen in

Last updated · 319 distilled / 1,201 read