CONCEPT Cited by 1 source
Augmented Inverse Probability Weighting (AIPW)
Definition
AIPW is a doubly-robust observational causal estimator that combines two nuisance models:
- A propensity model, the probability of exposure (treatment level) given observed context: e(x) = P(T=t | X=x).
- An outcome model, the expected outcome given exposure and context: m(x, t) = E[Y | X=x, T=t].
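The AIPW estimator for the average treatment effect on the outcome is, schematically, a contrast between exposure levels of the per-level estimate

$$\hat{\mu}_t = \frac{1}{n}\sum_{i=1}^{n}\left[\, m(X_i, t) + \frac{\mathbf{1}\{T_i = t\}}{e(X_i)}\,\bigl(Y_i - m(X_i, t)\bigr) \right], \qquad e(X_i) = P(T = t \mid X = X_i),$$

that is, an outcome-model baseline, augmented by an inverse-probability-weighted residual term. Published framing in Chernozhukov et al. 2021.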
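The description above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the cited source: the function names (`aipw_mean`, `aipw_ate`) are hypothetical, the data are simulated, and the nuisance predictions are oracle values standing in for fitted models.

```python
import numpy as np

def aipw_mean(y, treated, m_hat, e_hat):
    """AIPW estimate of the mean potential outcome at one exposure level.

    y       : observed outcomes, shape (n,)
    treated : 1 where the unit received the level of interest, else 0
    m_hat   : outcome-model predictions m(x_i, t) for that level
    e_hat   : propensity predictions P(T = t | X = x_i) for that level
    """
    y, treated, m_hat, e_hat = (np.asarray(a, dtype=float)
                                for a in (y, treated, m_hat, e_hat))
    # Outcome-model baseline plus inverse-probability-weighted residual.
    return float(np.mean(m_hat + treated / e_hat * (y - m_hat)))

def aipw_ate(y, t, m1_hat, m0_hat, e1_hat):
    """Binary-treatment ATE as the difference of two AIPW means."""
    t = np.asarray(t, dtype=float)
    e1_hat = np.asarray(e1_hat, dtype=float)
    mu1 = aipw_mean(y, t, m1_hat, e1_hat)
    mu0 = aipw_mean(y, 1.0 - t, m0_hat, 1.0 - e1_hat)
    return mu1 - mu0

# Simulated data (assumption): one confounder x, true effect of 2.0.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-x))          # true propensity
t = rng.binomial(1, e)
y = 1.0 + x + 2.0 * t + rng.normal(size=n)

# Oracle nuisances stand in for fitted models in this sketch.
ate = aipw_ate(y, t, m1_hat=3.0 + x, m0_hat=1.0 + x, e1_hat=e)
print(f"AIPW ATE estimate: {ate:.3f}")
```

In practice both nuisance functions would be estimated from data, typically with sample splitting so that each unit's predictions come from a model that did not see that unit.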
Why "doubly robust"
The estimator is consistent (it converges to the true effect as the sample grows) if either the propensity model or the outcome model is correctly specified. Formally: if e(x) is right, the IPW term corrects whatever bias the outcome model carries; if m(x, t) is right, the weighted residuals average to zero and the outcome-model term alone is consistent. AIPW uses both and needs only one.
That is a substantial risk reduction over either individual approach. IPW alone is notoriously unstable when propensities are near 0 or 1, because the inverse weights explode. Outcome regression alone is sensitive to functional-form misspecification in the outcome surface. AIPW hedges between the two: when the outcome model is approximately right, the IPW augmentation is a small correction; when it is wrong but the propensities are right, the IPW correction dominates and rescues the estimate.
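The double-robustness property can be checked on simulated data. The sketch below is an illustrative assumption, not taken from the cited source: it uses a binary treatment, one confounder, oracle values as the "correct" nuisances, and constants that ignore the confounder as the "misspecified" ones.

```python
import numpy as np

def aipw_ate(y, t, m1, m0, e):
    """Binary-treatment AIPW: outcome-model contrast plus weighted residuals."""
    return float(np.mean(m1 - m0
                         + t / e * (y - m1)
                         - (1 - t) / (1 - e) * (y - m0)))

rng = np.random.default_rng(42)
n = 100_000
x = rng.normal(size=n)                       # observed confounder
e_true = 1.0 / (1.0 + np.exp(-x))            # true propensity
t = rng.binomial(1, e_true)
y = 1.0 + 2.0 * t + 2.0 * x + rng.normal(size=n)   # true ATE = 2.0

# Correct nuisances (oracle values for this simulation).
m1_true, m0_true = 3.0 + 2.0 * x, 1.0 + 2.0 * x

# Misspecified nuisances: constants that ignore x entirely.
m1_bad = np.full(n, y[t == 1].mean())
m0_bad = np.full(n, y[t == 0].mean())
e_bad = np.full(n, t.mean())

results = {
    "both_right":       aipw_ate(y, t, m1_true, m0_true, e_true),
    "outcome_wrong":    aipw_ate(y, t, m1_bad, m0_bad, e_true),
    "propensity_wrong": aipw_ate(y, t, m1_true, m0_true, e_bad),
    "both_wrong":       aipw_ate(y, t, m1_bad, m0_bad, e_bad),
}
for name, est in results.items():
    print(f"{name:17s} {est:.2f}")
```

With this setup the first three estimates should land near the true effect of 2.0, while the case with both models misspecified collapses to the confounded difference in group means.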
What AIPW does not fix
- Unobserved confounders. Doubly robust is not omniscient; if a confounder is missing from both models' covariate set, AIPW is biased. Doubly robust only helps against functional-form misspecification, not variable-omission misspecification.
- Positivity violations. If some contexts have essentially zero probability of treatment (or zero probability of control), no amount of re-weighting produces a trustworthy counterfactual.
- Measurement error in the exposure. If T is a noisy proxy for the real intervention, both models are fitting the proxy.
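Of the limitations above, positivity is the one that can be partly diagnosed from the fitted propensities. The following sketch shows one possible check; the function name `overlap_report` and the 0.01 / 0.99 thresholds are assumptions chosen for illustration, and the effective sample size is the standard Kish formula applied to the inverse-probability weights.

```python
import numpy as np

def overlap_report(e_hat, t, lo=0.01, hi=0.99):
    """Summarise how close fitted propensities come to violating positivity.

    e_hat : estimated P(T = 1 | X = x_i), strictly between 0 and 1
    t     : observed binary treatment indicator
    """
    e_hat = np.asarray(e_hat, dtype=float)
    t = np.asarray(t)
    # Weight each unit receives for the arm it was actually observed in.
    w = np.where(t == 1, 1.0 / e_hat, 1.0 / (1.0 - e_hat))
    return {
        # Share of units with propensities outside the chosen bounds.
        "frac_extreme": float(np.mean((e_hat < lo) | (e_hat > hi))),
        "max_weight": float(w.max()),
        # Kish effective sample size: (sum w)^2 / sum w^2.
        "effective_n": float(w.sum() ** 2 / np.sum(w ** 2)),
    }

balanced = overlap_report([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
skewed = overlap_report([0.5, 0.5, 0.5, 0.001], [1, 0, 1, 1])
print(balanced)
print(skewed)
```

A large `frac_extreme` or an `effective_n` far below the raw sample size suggests that a few units dominate the weighted residual term, so the estimate rests on very little real counterfactual information.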
Why Lyft uses it for Step 2 of surrogacy
In Lyft's surrogacy framework, Step 2 estimates the causal effect of short-term negative user experiences (wait time, surge, cancellations, driver earnings, idleness) on future outcomes (future rides, retention, driver hours). The treatment variable is each user's level of exposure to negative experience, which varies naturally across users, times, and places. Lyft writes:
"We use Augmented Inverse Probability Weighting (AIPW, Chernozhukov et.al, 2021), a doubly robust causal estimator combining (i) a propensity model for exposure (the likelihood of facing a given level of negative user experience, given context) and (ii) outcome models for future metrics, conditional on confounders. This yields average treatment effects for negative user experience."
The per-exposure-level treatment effects are summarised into a surrogacy index that scales short-term exposure to long-term impact.
Verification
Lyft verifies the AIPW estimates using user-split experiments that perturb user-level negative experiences and compare the model's predicted changes in future outcomes to the experimentally observed lifts. The user-split design works at the individual level because market mediation is small within the randomised population: the market sees treated and control users mixed, so individual-level outcome differences reflect the mediator's direct effect on the user's own future behaviour, which is exactly what Step 2 is trying to estimate.
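The source does not say how the comparison between predicted and observed lifts is scored. One simple possibility, sketched here purely as an assumption, is to compute the experimental lift as a difference in means with a normal-approximation confidence interval and check whether the model's prediction falls inside it. The function names and the simulated data are hypothetical.

```python
import numpy as np

def experimental_lift(y_treat, y_ctrl, z=1.96):
    """Difference in means with a normal-approximation confidence interval."""
    y_treat = np.asarray(y_treat, dtype=float)
    y_ctrl = np.asarray(y_ctrl, dtype=float)
    lift = y_treat.mean() - y_ctrl.mean()
    se = np.sqrt(y_treat.var(ddof=1) / y_treat.size
                 + y_ctrl.var(ddof=1) / y_ctrl.size)
    return float(lift), (float(lift - z * se), float(lift + z * se))

def prediction_is_covered(predicted_lift, y_treat, y_ctrl):
    """True if the model-predicted lift lies inside the experimental interval."""
    _, (lo, hi) = experimental_lift(y_treat, y_ctrl)
    return bool(lo <= predicted_lift <= hi)

# Simulated user-split experiment (assumption): treatment lowers the
# future outcome by 0.5 on average.
rng = np.random.default_rng(1)
ctrl = rng.normal(10.0, 2.0, size=5000)
treat = rng.normal(9.5, 2.0, size=5000)

lift, ci = experimental_lift(treat, ctrl)
print(f"observed lift {lift:.3f}, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
print("prediction of +0.5 covered:", prediction_is_covered(0.5, treat, ctrl))
```

A prediction that repeatedly falls outside the experimental interval across several experiments would point to a misspecified nuisance model or an unobserved confounder in the observational estimate.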
Seen in
- Lyft — Beyond A/B Testing (2026-03-25) — canonical wiki instance. Lyft's Foundational Models team uses AIPW as the Step 2 estimator in its surrogacy framework for marketplace long-term-effect estimation. Verified with user-split experiments.
Related
- concepts/surrogacy-causal-inference — the framework AIPW is the Step 2 estimator for.
- concepts/residualized-regression — the Step 1 estimator whose output distribution AIPW consumes.
- concepts/user-split-experiment — the verification experiment for AIPW estimates.
- patterns/surrogacy-two-step-ltv-estimation — the composed pattern.