Asymptotically exact data augmentation: models, properties and algorithms
This work addresses a foundational problem in machine learning and statistics, of interest to researchers and practitioners who rely on inference methods such as Markov chain Monte Carlo (MCMC), although its contribution appears incremental in that it unifies existing models.
The paper tackles the challenge of introducing auxiliary variables for data augmentation in a systematic way, so that the target distribution is preserved and inference remains efficient. It proposes a unified framework, asymptotically exact data augmentation (AXDA), and shows that AXDA models have useful statistical properties and lead to efficient inference algorithms.
Data augmentation, through the introduction of auxiliary variables, has become a ubiquitous technique to improve convergence properties, simplify the implementation or reduce the computational time of inference methods such as Markov chain Monte Carlo methods. Nonetheless, there is no systematic way to introduce appropriate auxiliary variables while preserving the initial target probability distribution and keeping inference computationally efficient. To deal with these issues, this paper studies a unified framework, coined asymptotically exact data augmentation (AXDA), which encompasses both well-established and more recent approximate augmented models. More broadly, this paper shows that AXDA models enjoy interesting statistical properties and yield efficient inference algorithms. In non-asymptotic settings, the quality of the proposed approximation is assessed with several theoretical results, which are illustrated on standard statistical problems. Supplementary materials including computer code for this paper are available online.
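To make the idea of an approximate augmented model more concrete, the following is a minimal sketch and not the paper's code. It assumes a toy setting that the abstract does not specify: a Gaussian target N(0, sigma^2), a single auxiliary variable z, and a Gaussian coupling between theta and z with a tolerance parameter rho. Under these assumptions the marginal of theta in the augmented model is N(0, sigma^2 + rho^2), which approaches the target as rho goes to 0, and both conditionals are Gaussian, so a two-step Gibbs sampler is straightforward. The names `axda_gibbs` and `marginal_variance` are hypothetical and chosen for illustration only.

```python
import random
import statistics


def marginal_variance(sigma, rho):
    """Variance of theta under the assumed augmented model: sigma^2 + rho^2."""
    return sigma ** 2 + rho ** 2


def axda_gibbs(sigma, rho, n_iter, seed=0):
    """Gibbs sampler for the assumed joint density
    p(theta, z) proportional to exp(-z^2 / (2 sigma^2)) * exp(-(theta - z)^2 / (2 rho^2)).

    Assumes sigma > 0 and rho > 0. Returns the list of theta samples.
    """
    rng = random.Random(seed)
    # z | theta is Gaussian with mean shrink * theta and standard deviation z_std.
    shrink = sigma ** 2 / (sigma ** 2 + rho ** 2)
    z_std = (sigma ** 2 * rho ** 2 / (sigma ** 2 + rho ** 2)) ** 0.5
    theta = 0.0
    samples = []
    for _ in range(n_iter):
        z = rng.gauss(shrink * theta, z_std)  # draw the auxiliary variable
        theta = rng.gauss(z, rho)             # theta | z is N(z, rho^2)
        samples.append(theta)
    return samples


if __name__ == "__main__":
    sigma = 1.0
    for rho in (1.0, 0.5, 0.1):
        draws = axda_gibbs(sigma, rho, n_iter=50000, seed=1)
        print(rho, statistics.pvariance(draws), marginal_variance(sigma, rho))
```

In this toy case the bias introduced by the augmentation is explicit: the sampled variance exceeds the target variance by rho^2. A smaller rho reduces that bias but couples theta and z more tightly, so the chain mixes more slowly, which illustrates the kind of trade-off a non-asymptotic analysis has to quantify.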