ME LG EM MLMay 20, 2022

What's the Harm? Sharp Bounds on the Fraction Negatively Affected by Treatment

arXiv:2205.10327v210.835 citationsh-index: 37Has Code

Originality Incremental advance

AI Analysis

This work provides a method for decision-makers in fields like A/B testing and policy evaluation to assess potential harms without relying on unverifiable assumptions, though it is incremental in improving robustness over existing bounding approaches.

The paper addresses the problem of bounding the fraction of individuals negatively affected by a treatment when counterfactuals are unobservable, deriving sharp bounds that can be tightened with covariate stratification and developing a robust inference algorithm for credible conclusions, as demonstrated in simulations and a case study on career counseling.

The fundamental problem of causal inference -- that we never observe counterfactuals -- prevents us from identifying how many might be negatively affected by a proposed intervention. If, in an A/B test, half of users click (or buy, or watch, or renew, etc.), whether exposed to the standard experience A or a new one B, hypothetically it could be because the change affects no one, because the change positively affects half the user population to go from no-click to click while negatively affecting the other half, or something in between. While unknowable, this impact is clearly of material importance to the decision to implement a change or not, whether due to fairness, long-term, systemic, or operational considerations. We therefore derive the tightest-possible (i.e., sharp) bounds on the fraction negatively affected (and other related estimands) given data with only factual observations, whether experimental or observational. Naturally, the more we can stratify individuals by observable covariates, the tighter the sharp bounds. Since these bounds involve unknown functions that must be learned from data, we develop a robust inference algorithm that is efficient almost regardless of how and how fast these functions are learned, remains consistent when some are mislearned, and still gives valid conservative bounds when most are mislearned. Our methodology altogether therefore strongly supports credible conclusions: it avoids spuriously point-identifying this unknowable impact, focusing on the best bounds instead, and it permits exceedingly robust inference on these. We demonstrate our method in simulation studies and in a case study of career counseling for the unemployed.

View on arXiv PDF Code

Similar