LG ML | Jul 13, 2020

Regularized linear autoencoders recover the principal components, eventually

Xuchan Bao, James Lucas, Sushant Sachdeva, Roger Grosse

arXiv:2007.06731v2 | 14.039 citations | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a foundational gap in the theory of representation learning for neural networks, though the advance is incremental because it builds on prior regularization schemes.

The paper studies how representations converge in linear autoencoders. It shows that, with proper regularization, they can learn ordered principal components, but that convergence is slow because of ill-conditioning. A modified gradient descent update speeds up convergence empirically.

Our understanding of learning input-output relationships with neural nets has improved rapidly in recent years, but little is known about the convergence of the underlying representations, even in the simple case of linear autoencoders (LAEs). We show that when trained with proper regularization, LAEs can directly learn the optimal representation -- ordered, axis-aligned principal components. We analyze two such regularization schemes: non-uniform $\ell_2$ regularization and a deterministic variant of nested dropout [Rippel et al., ICML 2014]. Though both regularization schemes converge to the optimal representation, we show that this convergence is slow due to ill-conditioning that worsens with increasing latent dimension. We show that the inefficiency of learning the optimal representation is not inevitable -- we present a simple modification to the gradient descent update that greatly speeds up convergence empirically.
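To make the first regularization scheme concrete, here is a minimal NumPy sketch of a linear autoencoder trained with non-uniform $\ell_2$ regularization. The abstract does not state the objective, so the form used here is an assumption: the expected reconstruction error $\mathrm{tr}\big((W_2 W_1 - I)\,\Sigma\,(W_2 W_1 - I)^\top\big)$ plus $\|\Lambda^{1/2} W_1\|_F^2 + \|W_2 \Lambda^{1/2}\|_F^2$, with $\Lambda = \mathrm{diag}(\lambda_1 < \dots < \lambda_k)$. The function name `train_lae`, the synthetic covariance, the penalties, the learning rate, and the step count are all illustrative choices, not values from the paper. The sketch uses plain gradient descent, not the paper's modified update.

```python
import numpy as np

def train_lae(cov, lam, lr=0.02, steps=20000, seed=0):
    """Plain gradient descent on an LAE with per-latent l2 penalties (assumed objective)."""
    rng = np.random.default_rng(seed)
    d, k = cov.shape[0], lam.shape[0]
    W1 = 0.1 * rng.standard_normal((k, d))  # encoder, small random init
    W2 = 0.1 * rng.standard_normal((d, k))  # decoder, small random init
    Lam = np.diag(lam)                      # one penalty per latent dimension
    I = np.eye(d)
    for _ in range(steps):
        E = W2 @ W1 - I                     # reconstruction error operator
        g1 = 2 * W2.T @ E @ cov + 2 * Lam @ W1   # gradient w.r.t. encoder
        g2 = 2 * E @ cov @ W1.T + 2 * W2 @ Lam   # gradient w.r.t. decoder
        W1 -= lr * g1
        W2 -= lr * g2
    return W1, W2

# Synthetic covariance with known principal directions (the columns of Q).
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
eigvals = np.array([4.0, 1.0, 0.3, 0.2, 0.1])
cov = Q @ np.diag(eigvals) @ Q.T

# Increasing penalties: the smallest penalty goes to the first latent.
lam = np.array([0.1, 0.5])
W1, W2 = train_lae(cov, lam)

# |cosine| between decoder column i and the i-th principal direction.
cols = W2 / np.linalg.norm(W2, axis=0)
alignment = np.abs(np.sum(cols * Q[:, :2], axis=0))
print(alignment)
```

With two latent dimensions and well-separated eigenvalues, each decoder column should line up with its own principal direction (up to sign) rather than with an arbitrary rotation of the top-2 subspace. The large step count, even on this tiny problem, hints at the slow convergence the abstract describes for larger latent dimensions.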
