CONCEPT Cited by 1 source

Surrogacy in causal inference

Definition

Surrogacy is the design stance of estimating the causal effect of an intervention on a long-term outcome through a short-term mediator, the surrogate, whose relationship to the outcome can be modelled separately. The composition (intervention → surrogate → outcome) yields a long-term-effect estimate from mostly short-term data, on the assumption that the surrogate fully mediates the intervention's effect on the outcome.

The canonical trade-off: long-term outcomes are expensive to observe (long experiment windows, drift, attrition) and are mediated through many noisy channels, so A/B testing directly on the long-term outcome is slow and low-powered. Surrogacy replaces one hard estimation problem with two easier ones, intervention → surrogate and surrogate → outcome, each observable on short horizons and each verifiable with its own purpose-built experiment.
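The composition can be written out explicitly. The notation below is introduced here for illustration and is not from the source: T is the intervention, S the surrogate, Y the long-term outcome, and μ the surrogate-to-outcome response. The identity assumes full mediation and, additionally, no unmeasured confounding between surrogate and outcome.

```latex
\begin{aligned}
\tau &= \mathbb{E}[Y \mid do(T=1)] - \mathbb{E}[Y \mid do(T=0)] \\
     &= \int \mu(s)\,\bigl(dP_{1}(s) - dP_{0}(s)\bigr),
\qquad \mu(s) = \mathbb{E}[Y \mid do(S=s)],
\quad P_{t}(s) = P(S \le s \mid do(T=t)).
\end{aligned}
```

In the linear special case μ(s) = α + βs this reduces to τ = β·δ, where δ = E[S | do(T=1)] − E[S | do(T=0)] is the shift in the surrogate mean. For a non-linear μ the whole shifted distribution matters, not only its mean.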

The load-bearing assumption

"Market-mediated long-term effects are completely mediated by short-term [surrogate] experiences." (Lyft, 2026-03-25)

Any channel by which the intervention affects the outcome not through the chosen surrogate is invisible to the surrogacy estimator. Long-term brand effects, slow trust shifts, competitor responses, and macroeconomic feedback can all route around a short-term-user-experience surrogate. End-to-end verification experiments (see region-split) are the escape hatch for catching assumption violations, but only on the subset of interventions that get region-tested.
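A small simulation makes the blind spot concrete. This is a sketch on synthetic data, not anything from the source: the coefficients (surrogate shift 1.0, surrogate-to-outcome slope 2.0, direct effect 0.5) and the helper names are hypothetical. The composed forecast recovers only the mediated part of the effect, while an end-to-end comparison also sees the channel that bypasses the surrogate.

```python
import random
from statistics import fmean


def slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def diff_in_means(t, v):
    """Treated-minus-control mean of v under a randomised binary treatment t."""
    treated = [vi for ti, vi in zip(t, v) if ti == 1]
    control = [vi for ti, vi in zip(t, v) if ti == 0]
    return fmean(treated) - fmean(control)


def simulate(direct, n=20000, seed=0):
    """Return (composed forecast, end-to-end estimate) for a given direct effect."""
    rng = random.Random(seed)
    # Surrogate -> outcome slope, learned on separate data with no intervention.
    s_obs = [rng.gauss(0, 1) for _ in range(n)]
    y_obs = [2.0 * s + rng.gauss(0, 1) for s in s_obs]
    beta = slope(s_obs, y_obs)
    # Randomised experiment: the intervention shifts the surrogate by 1.0 and
    # (hypothetically) also moves the outcome by `direct` through another channel.
    t = [rng.randint(0, 1) for _ in range(n)]
    s = [1.0 * ti + rng.gauss(0, 1) for ti in t]
    y = [2.0 * si + direct * ti + rng.gauss(0, 1) for si, ti in zip(s, t)]
    composed = diff_in_means(t, s) * beta   # what the surrogacy estimator reports
    end_to_end = diff_in_means(t, y)        # what a direct long-run test would see
    return composed, end_to_end


composed_ok, e2e_ok = simulate(direct=0.0)        # full mediation holds
composed_leaky, e2e_leaky = simulate(direct=0.5)  # one channel bypasses the surrogate
print(f"full mediation: composed={composed_ok:.2f}, end-to-end={e2e_ok:.2f}")
print(f"bypass channel: composed={composed_leaky:.2f}, end-to-end={e2e_leaky:.2f}")
```

With the bypass channel switched on, the composed forecast stays near the mediated effect while the end-to-end estimate is larger by roughly the direct effect. That gap is what an end-to-end verification experiment is there to detect.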

Two-step estimator shape

  1. Step 1: intervention → surrogate distribution. Estimate how the intervention shifts the distribution (not just the mean) of the surrogate. Lyft uses residualised regression on deviations from a learned baseline, verified by switch-back experiments.

  2. Step 2: surrogate → outcome. Estimate the causal effect of the surrogate on the long-term outcome using observational inference with a doubly-robust estimator — Lyft uses AIPW, verified by user-split experiments. The mapping is summarised by a surrogacy index that scales short-term exposure to long-term impact.

  3. Composition. Combine Step 1 and Step 2 to forecast the intervention's long-term outcome effect.

  4. End-to-end verification. Run a region-split experiment periodically to validate the composed forecast — the only experiment shape that can observe long-term + market-mediated channels together.
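The steps above can be sketched end to end on simulated data. This is a minimal illustration under stated assumptions, not Lyft's implementation: the surrogate is reduced to a single bad-experience rate (the source works with the full distribution), the AIPW nuisance models are plain cell means over one discrete confounder, and every name and number (`step1_shift`, `step2_aipw`, the simulated coefficients) is hypothetical.

```python
import random
from statistics import fmean


def ols_slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residualise(y, z):
    """Residuals of y after removing a linear fit on the baseline covariate z."""
    b, mz, my = ols_slope(z, y), fmean(z), fmean(y)
    return [yi - (my + b * (zi - mz)) for yi, zi in zip(y, z)]


def step1_shift(t, s, z):
    """Step 1: residualised regression of surrogate s on treatment t, net of z."""
    return ols_slope(residualise(t, z), residualise(s, z))


def step2_aipw(b, y, x):
    """Step 2: AIPW effect of binary exposure b on outcome y, confounder x discrete."""
    cells = {}
    for bi, yi, xi in zip(b, y, x):
        cells.setdefault(xi, {0: [], 1: []})[bi].append(yi)
    # Nuisance models as cell means: propensity e(x) and outcome means m0(x), m1(x).
    e = {k: len(v[1]) / (len(v[0]) + len(v[1])) for k, v in cells.items()}
    m = {k: (fmean(v[0]), fmean(v[1])) for k, v in cells.items()}
    scores = []
    for bi, yi, xi in zip(b, y, x):
        m0, m1 = m[xi]
        scores.append(m1 - m0
                      + bi * (yi - m1) / e[xi]
                      - (1 - bi) * (yi - m0) / (1 - e[xi]))
    return fmean(scores)


rng = random.Random(0)

# Step 1 data: alternating (switch-back style) periods; z is a baseline demand index.
n1 = 4000
z = [rng.gauss(0, 1) for _ in range(n1)]
t = [i % 2 for i in range(n1)]
s = [0.30 + 0.05 * zi - 0.05 * ti + rng.gauss(0, 0.02) for zi, ti in zip(z, t)]
delta = step1_shift(t, s, z)          # simulated truth: -0.05

# Step 2 data: observational users; x confounds exposure and outcome.
n2 = 20000
x = [rng.randint(0, 1) for _ in range(n2)]
b = [1 if rng.random() < (0.6 if xi else 0.3) else 0 for xi in x]
y = [10 + 3 * xi - 2 * bi + rng.gauss(0, 1) for xi, bi in zip(x, b)]
beta = step2_aipw(b, y, x)            # simulated truth: -2

# Step 3: composition, valid only under full mediation.
forecast = delta * beta               # simulated truth: +0.10
print(f"delta={delta:.3f}, beta={beta:.2f}, forecast={forecast:.3f}")
```

Multiplying the two estimates is the linear, single-number version of the composition. With a distribution-valued surrogate, Step 2 would need a response for each part of the distribution and the composition would sum over them. Step 4 has no code counterpart here because it is a separate experiment, not a computation on these data.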

Why it's more than "chained regression"

Naive chaining is not surrogacy: surrogacy is a causal decomposition with a named mediation assumption. The assumption is testable (end-to-end experimentation) and falsifiable (a region-split result that disagrees with the composed forecast means the mediation assumption, a step estimator, or both are wrong). Surrogate selection is itself a design choice: a good surrogate (i) is strongly affected by the intervention, (ii) strongly affects the outcome, and (iii) fully mediates the effect, i.e. conditional on the surrogate, the outcome is independent of the intervention.
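Criterion (iii) suggests a simple diagnostic when experiment data include the intervention, the surrogate, and the outcome: regress the outcome on both and look at the intervention's coefficient. The sketch below is an assumption-laden illustration on synthetic data, not a method described in the source. It uses a linear model, assumes no unmeasured surrogate-outcome confounding, and all names and coefficients are hypothetical.

```python
import random
from statistics import fmean


def slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def resid(y, x):
    """Residuals of y after a linear fit on x."""
    b, mx, my = slope(x, y), fmean(x), fmean(y)
    return [yi - (my + b * (xi - mx)) for yi, xi in zip(y, x)]


def direct_effect_given_surrogate(t, s, y):
    """Coefficient on t in the regression y ~ t + s, computed by partialling out s."""
    return slope(resid(t, s), resid(y, s))


def simulate(direct, n=20000, seed=1):
    """Randomised t shifts s; y depends on s and, optionally, directly on t."""
    rng = random.Random(seed)
    t = [rng.randint(0, 1) for _ in range(n)]
    s = [ti + rng.gauss(0, 1) for ti in t]
    y = [2.0 * si + direct * ti + rng.gauss(0, 1) for si, ti in zip(s, t)]
    return t, s, y


full = direct_effect_given_surrogate(*simulate(direct=0.0))
leaky = direct_effect_given_surrogate(*simulate(direct=0.5))
print(f"residual intervention effect, full mediation: {full:.2f}")
print(f"residual intervention effect, bypass channel: {leaky:.2f}")
```

A residual coefficient near zero is consistent with full mediation; a clearly non-zero one points to a channel that bypasses the surrogate. A near-zero value does not prove the assumption, since the check only covers the outcome horizon and population present in the data.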

Seen in

  • Lyft — Beyond A/B Testing (2026-03-25) — canonical wiki instance. Lyft's Foundational Models team uses surrogacy to estimate the market-mediated component of the long-term effects of pricing / incentive decisions. Surrogate = distribution of short-term negative user experiences (wait time, surge, cancellations, driver earnings, idleness). Step 1 uses residualised regression + switch-back verification; Step 2 uses AIPW + user-split verification; end-to-end region-split is the final arbiter. See patterns/surrogacy-two-step-ltv-estimation for the composed pattern.