SYSTEM Cited by 5 sources

Dash Relevance Ranker

Definition

Dash Relevance Ranker is the learning-to-rank model that scores each candidate document against a query inside Dash's unified search index and produces the top-K ordering passed to the answering LLM. The 2026-02-26 Dropbox Tech post names it as XGBoost-class (gradient-boosted trees) — "trained using machine learning techniques such as XGBoost rather than manually tuned rules" (Source: sources/2026-02-26-dropbox-using-llms-to-amplify-human-labeling-dash-search).

Previously described indirectly on systems/dash-search-index as "multiple ranking passes; per-query + per-user relevance combining lexical match, vector similarity, and graph-derived signals. Personalized and ACL'd to you." (Source: sources/2026-01-28-dropbox-knowledge-graphs-mcp-dspy-dash). This page is the primary landing page for the ranker itself.

Where it sits

Pipeline (Dash retrieval, annotated):

query
Dash Search Index (hybrid BM25 + dense vectors + knowledge-bundles)
  │  returns candidate set
Dash Relevance Ranker (XGBoost; per-(query,doc) features) ◄── features from Dash Feature Store
  │  scores + orders
Top-K → answering LLM's context window
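The rank-and-truncate stage above can be sketched as follows. This is a minimal illustration, not Dropbox's implementation: `feature_store`, `booster`, and their methods are hypothetical stand-ins for the Dash Feature Store and the gradient-boosted-tree model.

```python
# Hypothetical sketch of the retrieve -> rank -> top-K flow.
# feature_store and booster are illustrative stand-ins, not Dropbox APIs.
from typing import List, Tuple

def rank_candidates(query: str,
                    candidates: List[str],
                    feature_store,
                    booster,
                    k: int = 10) -> List[Tuple[str, float]]:
    """Score each (query, doc) pair with a gradient-boosted-tree model
    and return the top-K documents for the answering LLM's context."""
    scored = []
    for doc_id in candidates:
        feats = feature_store.get(query, doc_id)   # per-(query, doc) feature vector
        score = booster.predict([feats])[0]        # single-row tree inference
        scored.append((doc_id, score))
    scored.sort(key=lambda t: t[1], reverse=True)  # highest relevance first
    return scored[:k]
```

The key property is that only the truncated top-K ordering, not the full candidate set, reaches the answering LLM's context window.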

Training signal

The ranker is trained on graded 1–5 relevance labels (concepts/relevance-labeling) over (query, document) pairs. The labels are not hand-produced at scale — they come from the human-calibrated LLM labeling pipeline:

  1. Small human-labeled seed set (internal, non-sensitive data only).
  2. LLM judge calibrated against the seed set via MSE on the 1–5 scale (per-pair squared error ranges 0–16, since the worst disagreement is (5 − 1)² = 16).
  3. Calibrated judge produces hundreds of thousands to millions of labels.
  4. Labels train XGBoost on (query, doc) features.
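The calibration check in step 2 can be sketched as a plain MSE over paired grades. A minimal sketch; the acceptance threshold a judge must clear before bulk labeling is not disclosed.

```python
def judge_mse(human_grades, judge_grades):
    """Mean squared error between human seed labels and LLM-judge labels,
    both on the 1-5 relevance scale. Per-pair squared error ranges 0-16,
    since the worst possible disagreement is (5 - 1) ** 2."""
    assert len(human_grades) == len(judge_grades) and human_grades
    return sum((h - j) ** 2
               for h, j in zip(human_grades, judge_grades)) / len(human_grades)
```

A judge that agrees perfectly with the seed set scores 0.0; one that is maximally wrong on every pair scores 16.0.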

Production NDCG is the model-quality metric; the ranker is iterated on by measuring NDCG against held-out judge-labeled slices.
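NDCG itself reduces to a short computation over graded labels — a minimal sketch using the standard log2 discount, assuming the 1–5 grades are used directly as gains:

```python
import math

def dcg(rels):
    """Discounted cumulative gain: graded relevance discounted by
    log2(rank + 1), so errors near the top of the list cost more."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

def ndcg(ranked_rels, k=None):
    """NDCG@k: DCG of the model's ordering divided by DCG of the ideal
    (descending-relevance) ordering. 1.0 means a perfect ranking."""
    k = len(ranked_rels) if k is None else k
    ideal = dcg(sorted(ranked_rels, reverse=True)[:k])
    return dcg(ranked_rels[:k]) / ideal if ideal > 0 else 0.0
```

For example, a ranker that places a grade-5 document below a grade-1 document scores below 1.0, and the penalty shrinks as the inversion moves down the list.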

Features (inferred)

Not exhaustively enumerated in any single source. Signals named across the Dash posts:

  • Lexical match (BM25, from the hybrid index)
  • Dense vector similarity
  • Graph-derived signals (knowledge-bundle relationships)
  • Per-user personalization signals

All are fetched per (query, doc) pair from the Dash Feature Store at serving time.

Why XGBoost, not an LLM, at query time

Explicit framing in the labeling post:

"Using LLMs directly at query time to replace traditional ranking models is not currently feasible due to context window limitations and latency constraints. Instead, Dash uses LLMs offline to generate high-quality training data."

The LLM is the teacher; XGBoost is the student that runs at serving time. Classical training-vs-serving split in ML systems.

Relationship to the labeling pipeline

The ranker's quality is bounded by the quality of its relevance labels. Three levers in the labeling pipeline feed ranker quality directly: the quality of the human-labeled seed set, the calibration of the LLM judge against that seed set (MSE on the 1–5 scale), and the volume of judge-generated labels available for training.

All three are described in the 2026-02-26 labeling post; the 2026-01-28 transcript adds the DSPy flywheel that closes the loop with automated prompt tuning.

Cross-modal expansion

Current ranker: text-centric (documents, messages, snippets). Dash's forward plans (from the post): extend to images, video, messages, chat. Each modality encodes relevance differently and may need its own feature extractors + sub-model. The labeling pipeline is explicitly positioned as the shared mechanism that scales across modalities:

"Human-calibrated LLM evaluation provides a shared mechanism for adapting relevance judgments across modalities without rebuilding labeling pipelines or redefining evaluation criteria from scratch."

Caveats

  • Feature list incomplete. Dropbox has not published a fully-enumerated feature spec.
  • Training cadence not disclosed. No information on retraining frequency, online vs offline, or A/B rollout methodology for new ranker versions.
  • No latency numbers for the ranker itself. The feature-store budget is p95 25–35ms for feature fetches; how much of the remaining sub-100ms per-query budget the XGBoost model consumes is not stated.
  • Multi-pass architecture hinted, not described. "Multiple ranking passes" appears in the 2026-01-28 transcript — likely coarse + fine re-ranker stages — but the stage boundary isn't specified.
  • Not confirmed as literal XGBoost. The 2026-02-26 post says "such as XGBoost" — the actual production model may be a related gradient-boosted-tree implementation (LightGBM, CatBoost, custom), not stock XGBoost.
