LG, NE
Dec 20, 2013

Unsupervised Pretraining Encourages Moderate-Sparseness

arXiv:1312.5813v2
2 citations
Originality: Synthesis-oriented
AI Analysis

This paper offers a new explanation for a common practice in deep learning, unsupervised pretraining, but the contribution is incremental because it builds on existing regularization and optimization accounts.

The paper asks why unsupervised pretraining improves neural network performance and shows that it leads to moderate sparseness in hidden-unit activations, with experimental support from the MNIST and Birdsong datasets.

It is well known that direct training of deep neural networks generally leads to poor results. A major advance in recent years is the invention of various pretraining methods to initialize network parameters, and it has been shown that such methods lead to good prediction performance. However, the reason for the success of pretraining is not fully understood, although it has been argued that regularization and better optimization play certain roles. This paper provides another explanation for the effectiveness of pretraining: we show that pretraining leads to sparseness of hidden-unit activations in the resulting neural networks. The main reason is that the pretraining models can be interpreted as adaptive sparse coding. Our experimental results on MNIST and Birdsong, in comparison with a deep neural network with sigmoid activation, further support this sparseness observation.
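To make "sparseness of hidden-unit activations" concrete, here is a minimal sketch of one way such sparseness could be quantified. It assumes the Hoyer sparseness measure, which the abstract does not name, and the helper names `sigmoid`, `hidden_activations`, and `hoyer_sparseness` are hypothetical illustrations rather than anything taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, the activation the abstract mentions."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_activations(x, weights, biases):
    """Sigmoid activations of one hidden layer.

    weights holds one row of input weights per hidden unit
    (hypothetical layout chosen for this sketch).
    """
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def hoyer_sparseness(activations):
    """Hoyer sparseness of a vector, in [0, 1].

    0 means all entries have equal magnitude (dense);
    1 means exactly one entry is nonzero (maximally sparse).
    Assumption: this measure is used only for illustration here.
    """
    n = len(activations)
    if n < 2:
        raise ValueError("need at least two activations")
    l1 = sum(abs(a) for a in activations)
    l2 = math.sqrt(sum(a * a for a in activations))
    if l2 == 0.0:
        # Convention for this sketch: an all-zero vector is reported as 0.
        return 0.0
    root_n = math.sqrt(n)
    return (root_n - l1 / l2) / (root_n - 1.0)


# A layer with zero weights and biases outputs 0.5 everywhere: fully dense.
dense = hidden_activations([1.0, 2.0], [[0.0, 0.0]] * 4, [0.0] * 4)
print(hoyer_sparseness(dense))

# A vector with a single active unit is maximally sparse.
print(hoyer_sparseness([0.0, 0.0, 0.0, 1.0]))
```

Averaging this score over a dataset for a pretrained network and for a directly trained one would give a simple way to compare the two, under the stated assumption about the measure.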
