Modeling Sampling Distributions of Test Statistics with Autograd
This work addresses a specific challenge in statistical inference, but it appears incremental: it compares a new approach against an established method and does not claim a broad breakthrough.
The paper addresses the problem of accurately modeling the cumulative distribution function of a test statistic for simulation-based inference with correct conditional coverage. It finds that differentiating a neural network model of that function with autograd, in order to approximate the sampling distribution, is a viable alternative to existing density-ratio methods.
Simulation-based inference methods that feature correct conditional coverage of confidence sets, based on observations that have been compressed to a scalar test statistic, require accurate modeling of either the p-value function or the cumulative distribution function (cdf) of the test statistic. If the model of the cdf, which is typically a deep neural network, is a function of the test statistic, then the derivative of the neural network with respect to the test statistic furnishes an approximation of the sampling distribution of the test statistic. We explore whether this approach to modeling conditional one-dimensional sampling distributions is a viable alternative to the probability density-ratio method, also known as the likelihood-ratio trick. We use relatively simple yet effective neural network models and quantify their predictive uncertainty through a variety of methods.
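The core idea, that differentiating a model of the cdf with respect to the test statistic yields an approximation of its sampling density, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Dual`, `cdf_model`, and `cdf_and_density` are hypothetical, the cdf is a closed-form mixture of two logistic cdfs standing in for a trained neural network, and the derivative is computed with a small hand-written forward-mode automatic differentiation rather than with the autograd facility of a deep learning framework.

```python
import math
from dataclasses import dataclass


@dataclass
class Dual:
    """A value paired with its derivative with respect to the test statistic t."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def sigmoid(x: Dual) -> Dual:
    """Logistic function with its derivative propagated by the chain rule."""
    s = 1.0 / (1.0 + math.exp(-x.val))
    return Dual(s, s * (1.0 - s) * x.dot)


def cdf_model(t: Dual, theta: float) -> Dual:
    """Hypothetical stand-in for a trained cdf model F(t | theta).

    A mixture of two logistic cdfs centred at theta, so it is monotone in t
    and runs from 0 to 1, as a cdf must.
    """
    z = t + (-theta)
    return 0.5 * sigmoid(2.0 * z) + 0.5 * sigmoid(0.5 * z)


def cdf_and_density(t: float, theta: float):
    """Return F(t | theta) and its derivative dF/dt, the approximate density."""
    out = cdf_model(Dual(t, 1.0), theta)  # seed dt/dt = 1
    return out.val, out.dot


if __name__ == "__main__":
    F, f = cdf_and_density(0.3, theta=1.0)
    print(f"cdf = {F:.4f}, density = {f:.4f}")
```

With a real neural network, the same derivative would be requested from the framework's automatic differentiation instead of being propagated by hand. The approximation is only meaningful as a density if the learned cdf is non-decreasing in the test statistic, which a trained network does not guarantee by default.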