Robust Bayesian Inference for Discrete Outcomes with the Total Variation Distance
This work addresses robustness in Bayesian inference for discrete data, which matters to researchers and practitioners working with misspecified models. The contribution appears incremental, since it builds on existing discrepancy-based methods.
The authors tackle misspecification of discrete-outcome models, caused for example by zero-inflation, overdispersion or contamination, by introducing a robust Bayesian approach based on the Total Variation Distance (TVD). They report significantly improved predictive performance on simulated and real-world data.
Models of discrete-valued outcomes are easily misspecified if the data exhibit zero-inflation, overdispersion or contamination. Without additional knowledge about the existence and nature of this misspecification, model inference and prediction are adversely affected. Here, we introduce a robust discrepancy-based Bayesian approach using the Total Variation Distance (TVD). In the process, we address and resolve two challenges. First, we study convergence and robustness properties of a computationally efficient estimator for the TVD between a parametric model and the data-generating mechanism. Second, we provide an efficient inference method adapted from Lyddon et al. (2019), which corresponds to formulating an uninformative nonparametric prior directly over the data-generating mechanism. Finally, we empirically demonstrate that our approach is robust and significantly improves predictive performance on a range of simulated and real-world data.
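To make the two ingredients concrete, the following minimal Python sketch may help. It is an illustration under stated assumptions, not the estimator or algorithm studied in the paper. It assumes a Poisson model, a simple plug-in TVD against the (weighted) empirical probability mass function, a grid search in place of a proper optimiser, and a Bayesian-bootstrap style reweighting with Dirichlet(1, ..., 1) weights as one reading of "an uninformative nonparametric prior over the data-generating mechanism". All function names and the data are made up for the example.

```python
import math
import random
from collections import Counter


def poisson_pmf(x, rate):
    """Poisson probability mass at a non-negative integer x (computed in log-space)."""
    return math.exp(x * math.log(rate) - rate - math.lgamma(x + 1))


def weighted_pmf(data, weights):
    """Weighted empirical pmf: total weight placed on each observed value."""
    pmf = Counter()
    for x, w in zip(data, weights):
        pmf[x] += w
    return pmf


def tvd_to_pmf(rate, p_hat):
    """TVD between Poisson(rate) and a finitely supported pmf p_hat."""
    abs_diff = 0.0
    model_mass_seen = 0.0
    for x, mass in p_hat.items():
        p = poisson_pmf(x, rate)
        abs_diff += abs(p - mass)
        model_mass_seen += p
    # On values that never occur in the data p_hat is zero, so those values
    # contribute the model's remaining probability mass.
    return 0.5 * (abs_diff + max(0.0, 1.0 - model_mass_seen))


def minimise_tvd(p_hat, grid):
    """Hypothetical grid-search stand-in for a proper optimiser."""
    return min(grid, key=lambda rate: tvd_to_pmf(rate, p_hat))


def dirichlet_uniform(n, rng):
    """One draw from Dirichlet(1, ..., 1) via normalised Exp(1) variables."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def tvd_bootstrap(data, n_draws, grid, rng):
    """Assumed scheme: reweight the data at random, then minimise the TVD."""
    samples = []
    for _ in range(n_draws):
        weights = dirichlet_uniform(len(data), rng)
        samples.append(minimise_tvd(weighted_pmf(data, weights), grid))
    return samples


# Made-up data: 90 counts shaped roughly like Poisson(3), plus 10 outliers at 50.
counts = {0: 4, 1: 13, 2: 20, 3: 21, 4: 15, 5: 9, 6: 5, 7: 2, 8: 1, 50: 10}
data = [value for value, n in counts.items() for _ in range(n)]
grid = [0.5 + 0.05 * k for k in range(231)]  # candidate rates from 0.5 to 12.0

mle_rate = sum(data) / len(data)  # the Poisson MLE is the sample mean
uniform = [1.0 / len(data)] * len(data)
tvd_rate = minimise_tvd(weighted_pmf(data, uniform), grid)
draws = tvd_bootstrap(data, 50, grid, random.Random(0))
print(mle_rate, tvd_rate, min(draws), max(draws))
```

In this toy setting the sample mean is 7.73, pulled far from the bulk of the data by the ten outliers, whereas the TVD minimiser should land near 3 because a point mass at 50 can only ever cost the model about its own weight in total variation. The spread of `draws` gives a rough picture of posterior uncertainty under the assumed reweighting scheme.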