LG ML Jun 30, 2020

Tomographic Auto-Encoder: Unsupervised Bayesian Recovery of Corrupted Data

Francesco Tonolini, Pablo G. Moreno, Andreas Damianou, Roderick Murray-Smith

arXiv:2006.16938v13.31 citations

Originality: Incremental advance

AI Analysis

The paper addresses uncertainty-aware recovery of corrupted data, with applications such as data imputation and de-noising. The advance is incremental, since it builds on existing variational methods.

The paper tackles unsupervised recovery of corrupted data, such as data with missing values and noise. It proposes a probabilistic method that recovers accurate posteriors over the clean values, which makes it possible to explore reconstruction uncertainty. The authors report superior performance to existing variational methods on imputation and de-noising tasks with real data sets, and higher classification accuracy after imputation.

We propose a new probabilistic method for unsupervised recovery of corrupted data. Given a large ensemble of degraded samples, our method recovers accurate posteriors of clean values, allowing the exploration of the manifold of possible reconstructed data and hence characterising the underlying uncertainty. In this setting, direct application of classical variational methods often gives rise to collapsed densities that do not adequately explore the solution space. Instead, we derive our novel reduced entropy condition approximate inference method that results in rich posteriors. We test our model in a data recovery task under the common setting of missing values and noise, demonstrating superior performance to existing variational methods for imputation and de-noising with different real data sets. We further show higher classification accuracy after imputation, proving the advantage of propagating uncertainty to downstream tasks with our model.
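To make the "missing values and noise" setting concrete, here is a minimal sketch of one way such corrupted observations could be simulated. It is not taken from the paper: the function name `corrupt`, the parameters `missing_rate` and `noise_std`, and the choice of additive Gaussian noise with entries missing completely at random are all assumptions made for illustration.

```python
import numpy as np

def corrupt(x, missing_rate=0.3, noise_std=0.1, seed=0):
    """Simulate a corrupted observation of clean data x (hypothetical helper).

    Assumptions, not from the paper: additive Gaussian noise, and entries
    dropped independently at random with probability missing_rate.
    """
    rng = np.random.default_rng(seed)
    # mask is True where a value is observed, False where it is missing
    mask = rng.random(x.shape) >= missing_rate
    # add zero-mean Gaussian noise to every entry
    noisy = x + noise_std * rng.normal(size=x.shape)
    # missing entries are marked with NaN
    y = np.where(mask, noisy, np.nan)
    return y, mask

# A small clean data set: 100 samples with 5 features each
clean = np.random.default_rng(42).normal(size=(100, 5))
observed, mask = corrupt(clean, missing_rate=0.3, noise_std=0.1)
```

In the unsupervised setting described in the abstract, only arrays like `observed` and `mask` would be available for training, never `clean`. A recovery model would then aim to return a distribution over plausible clean values for each sample rather than a single point estimate.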
