LG, DIS-NN, ML | May 18, 2023

High-dimensional Asymptotics of Denoising Autoencoders

arXiv:2305.11041v1 | 25 citations
Originality: Incremental advance
AI Analysis

This work studies the denoising of high-dimensional data, a problem relevant to machine learning practitioners. It offers theoretical insights and validation on real data sets, though it appears incremental because it builds on existing autoencoder architectures.

The paper tackles the problem of denoising data from a Gaussian mixture using a two-layer non-linear autoencoder with tied weights and a skip connection, analyzed in a high-dimensional limit. It provides closed-form expressions for the denoising mean-squared test error and uses them to quantify the advantage of this architecture over autoencoders without skip connections.

We address the problem of denoising data from a Gaussian mixture using a two-layer non-linear autoencoder with tied weights and a skip connection. We consider the high-dimensional limit where the number of training samples and the input dimension jointly tend to infinity while the number of hidden units remains bounded. We provide closed-form expressions for the denoising mean-squared test error. Building on this result, we quantitatively characterize the advantage of the considered architecture over the autoencoder without the skip connection that relates closely to principal component analysis. We further show that our results accurately capture the learning curves on a range of real data sets.
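To make the setup concrete, the following is a minimal NumPy sketch of a two-layer autoencoder with tied weights and a skip connection, applied to noisy samples from a two-cluster Gaussian mixture. It is an illustration under stated assumptions, not the paper's implementation: the `tanh` activation, the `1/sqrt(d)` scaling, the scalar skip strength `b`, the additive noise model, and the helper names `dae_forward`, `sample_mixture`, and `denoising_mse` are all hypothetical choices made here for the example. The weights are random rather than trained, so the printed error only demonstrates how the quantity is computed.

```python
import numpy as np

def dae_forward(x_noisy, W, b, activation=np.tanh):
    """Two-layer autoencoder with tied weights and a skip connection.

    x_noisy: (n, d) corrupted inputs.
    W:       (p, d) weights shared by the encoder and the decoder (tied).
    b:       scalar skip-connection strength (assumed scalar for illustration).
    """
    d = x_noisy.shape[1]
    hidden = activation(x_noisy @ W.T / np.sqrt(d))   # (n, p) encoder output
    return b * x_noisy + hidden @ W / np.sqrt(d)      # (n, d) decoder plus skip

def sample_mixture(n, d, mu, sigma, rng):
    """Balanced two-cluster Gaussian mixture with means +mu and -mu."""
    signs = rng.choice([-1.0, 1.0], size=(n, 1))
    return signs * mu + sigma * rng.standard_normal((n, d))

def denoising_mse(x_clean, x_noisy, W, b):
    """Squared reconstruction error against the clean data, averaged over samples."""
    diff = x_clean - dae_forward(x_noisy, W, b)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

rng = np.random.default_rng(0)
n, d, p = 200, 50, 1              # p hidden units stays small while n and d grow
mu = np.ones(d)                   # hypothetical cluster mean
x = sample_mixture(n, d, mu, sigma=0.3, rng=rng)

noise_level = 0.5                 # assumed additive Gaussian corruption
x_noisy = x + np.sqrt(noise_level) * rng.standard_normal((n, d))

W = rng.standard_normal((p, d))   # untrained weights, for illustration only
print(denoising_mse(x, x_noisy, W, b=0.5))
```

Setting `b = 0` in this sketch removes the skip connection, which gives the baseline architecture that the abstract compares against.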

