LG, ME, ML · Mar 6, 2024

Sample size planning for conditional counterfactual mean estimation with a K-armed randomized experiment

arXiv:2403.04039v1
Originality: Synthesis-oriented
AI Analysis

The paper addresses sample size planning for causal inference in subgroups. The contribution is incremental: it builds on existing methods for simultaneous inference.

The paper tackles the problem of choosing a sample size for a K-armed randomized experiment whose goal is to estimate conditional counterfactual expectations in data-driven subgroups. It shows that the recommended sample size depends on the number of inferences to be conducted, and that the relationship can be inverted: for a fixed budget, it indicates how many treatment arms or how complex a partition is feasible. The guarantees are evaluated on a large public dataset.

We cover how to determine a sufficiently large sample size for a $K$-armed randomized experiment in order to estimate conditional counterfactual expectations in data-driven subgroups. The subgroups can be output by any feature space partitioning algorithm, for example by binning users with similar predictive scores or by learning a policy tree. After carefully specifying the inference target, a minimum confidence level, and a maximum margin of error, the key is to turn the original goal into a simultaneous inference problem, where the sample size recommended to offset the increased possibility of estimation error is directly related to the number of inferences to be conducted. Given a fixed sample size budget, our result allows us to invert the question into one about the feasible number of treatment arms or the feasible partition complexity (e.g. number of decision tree leaves). Using policy trees to learn subgroups, we evaluate our nominal guarantees on a large, publicly available randomized experiment test data set.
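The link between sample size and the number of inferences described above can be illustrated with a minimal sketch. It is not the paper's formula. It assumes a Bonferroni correction over all arm-by-subgroup cells, a normal approximation, a known common outcome standard deviation `sigma`, and equally sized cells. The function names `per_cell_sample_size` and `max_subgroups` are hypothetical and introduced only for illustration.

```python
import math
from statistics import NormalDist


def per_cell_sample_size(num_arms, num_subgroups, sigma, margin, confidence=0.95):
    """Observations needed in each (arm, subgroup) cell.

    Assumption (not from the paper): Bonferroni-corrected two-sided normal
    intervals with a known standard deviation `sigma` in every cell.
    """
    num_inferences = num_arms * num_subgroups
    alpha = 1.0 - confidence
    # Split the total error budget alpha evenly across all inferences.
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * num_inferences))
    return math.ceil((z * sigma / margin) ** 2)


def max_subgroups(total_budget, num_arms, sigma, margin, confidence=0.95):
    """Invert the question: largest number of subgroups a fixed budget supports.

    Assumption: every cell receives the same number of observations, so the
    total requirement is arms * subgroups * per-cell size.
    """
    feasible = 0
    while True:
        candidate = feasible + 1
        needed = num_arms * candidate * per_cell_sample_size(
            num_arms, candidate, sigma, margin, confidence
        )
        if needed > total_budget:
            return feasible
        feasible = candidate


if __name__ == "__main__":
    # Hypothetical inputs: 3 arms, 8 tree leaves, sigma = 1, margin = 0.1.
    n_cell = per_cell_sample_size(3, 8, sigma=1.0, margin=0.1)
    print("per-cell sample size:", n_cell)
    print("total sample size:", 3 * 8 * n_cell)
    print("feasible leaves for 50,000 units:", max_subgroups(50_000, 3, 1.0, 0.1))
```

Under these assumptions the per-cell requirement grows only through the corrected quantile, roughly logarithmically in the number of inferences, while the total requirement also scales with the number of cells. In practice subgroups are rarely equally sized, so the smallest subgroup would drive the requirement.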

