Discriminative calibration: Check Bayesian computation from simulations and flexible classifier
This work addresses the challenge of verifying Bayesian computation, a concern for researchers and practitioners in statistics and machine learning. It offers a more powerful and interpretable alternative to existing methods, although the contribution is incremental in that it builds on simulation-based calibration (SBC).
The authors address the problem of checking the accuracy of Bayesian computation by replacing the rank-based simulation-based calibration (SBC) test with a flexible classification approach. This avoids SBC's drawbacks, such as ad hoc test statistics and the lack of an interpretable divergence measure. The method typically has higher statistical power than the SBC rank test and returns an interpretable divergence measure computed from classification accuracy. It is validated with numerical and real data experiments.
To check the accuracy of Bayesian computations, it is common to use rank-based simulation-based calibration (SBC). However, SBC has drawbacks: the test statistic is somewhat ad hoc, interactions are difficult to examine, multiple testing is a challenge, and the resulting p-value is not a divergence metric. We propose to replace the marginal rank test with a flexible classification approach that learns test statistics from data. The resulting test typically has higher statistical power than the SBC rank test and returns an interpretable divergence measure of miscalibration, computed from classification accuracy. This approach can be used with different data generating processes to address likelihood-free inference or traditional inference methods like Markov chain Monte Carlo or variational inference. We illustrate an automated implementation using neural networks and statistically inspired features, and validate the method with numerical and real data experiments.
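As a rough illustration of the classification idea, the sketch below uses a toy conjugate model (theta ~ N(0, 1), y | theta ~ N(theta, 1), so the exact posterior is N(y/2, 1/2)). It trains a plain logistic regression on hand-picked quadratic features, swapped in here for the neural network the abstract mentions. The toy model, the `post_scale` knob that mimics faulty inference, and all function names are assumptions made for this example, not the authors' implementation. The classifier tries to tell pairs (theta, y) drawn from the joint distribution apart from pairs (approximate posterior draw, y).

```python
import numpy as np

def simulate(n, post_scale, rng):
    """Simulate true parameters, data, and approximate posterior draws (toy model)."""
    theta = rng.normal(0.0, 1.0, size=n)       # prior: theta ~ N(0, 1)
    y = rng.normal(theta, 1.0)                 # likelihood: y | theta ~ N(theta, 1)
    # Exact posterior is N(y/2, 1/2). post_scale != 1 inflates or shrinks its
    # standard deviation to mimic miscalibrated inference (assumption).
    theta_approx = rng.normal(y / 2.0, post_scale * np.sqrt(0.5))
    return theta, theta_approx, y

def features(theta, y):
    """Intercept plus linear and quadratic terms in (theta, y)."""
    return np.column_stack(
        [np.ones_like(theta), theta, y, theta**2, theta * y, y**2]
    )

def fit_logistic(X, t, n_iter=25, ridge=1e-3):
    """Logistic regression fitted by Newton's method with a small ridge penalty."""
    w = np.zeros(X.shape[1])
    eye = np.eye(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (p - t) + ridge * w
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + ridge * eye
        w = w - np.linalg.solve(hess, grad)
    return w

def heldout_accuracy(post_scale, n_sims=4000, seed=0):
    """Held-out accuracy of the classifier; values near 0.5 suggest calibration."""
    rng = np.random.default_rng(seed)
    theta, theta_approx, y = simulate(n_sims, post_scale, rng)
    half = n_sims // 2

    def design(sl):
        # Label 1: (true theta, y). Label 0: (approximate posterior draw, y).
        X = np.vstack([features(theta[sl], y[sl]),
                       features(theta_approx[sl], y[sl])])
        t = np.concatenate([np.ones(half), np.zeros(half)])
        return X, t

    # Split by simulation index so train and test never share the same y.
    X_train, t_train = design(slice(0, half))
    X_test, t_test = design(slice(half, n_sims))
    w = fit_logistic(X_train, t_train)
    pred = (X_test @ w > 0).astype(float)
    return float(np.mean(pred == t_test))

acc_calibrated = heldout_accuracy(post_scale=1.0)      # exact posterior
acc_overdispersed = heldout_accuracy(post_scale=2.0)   # posterior sd doubled
print("calibrated:", acc_calibrated)
print("overdispersed:", acc_overdispersed)
```

Under these assumptions, held-out accuracy close to 0.5 is consistent with calibrated inference, while accuracy clearly above 0.5 means the classifier can distinguish approximate draws from true parameters. Raw accuracy is only a proxy here: the abstract describes a divergence measure computed from classification accuracy, which this sketch does not reproduce.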