CL · Aug 20, 2015

Auto-Sizing Neural Networks: With Applications to n-gram Language Models

arXiv:1508.05051v1 · 47 citations
Originality: Incremental advance
AI Analysis

The paper addresses network design complexity for researchers and practitioners in natural language processing. The advance is incremental because it builds on existing regularization techniques.

The paper tackles the problem of manually selecting the number of hidden units in neural networks. It introduces an automatic pruning method based on ℓ∞,1 and ℓ2,1 regularization. Applied to n-gram language models, the method maintains perplexity; in machine translation, the smaller models retain the performance improvements of their unpruned versions.

Neural networks have been shown to improve performance across a range of natural-language tasks. However, designing and training them can be complicated. Frequently, researchers resort to repeated experimentation to pick optimal settings. In this paper, we address the issue of choosing the correct number of units in hidden layers. We introduce a method for automatically adjusting network size by pruning out hidden units through $\ell_{\infty,1}$ and $\ell_{2,1}$ regularization. We apply this method to language modeling and demonstrate its ability to correctly choose the number of hidden units while maintaining perplexity. We also include these models in a machine translation decoder and show that these smaller neural models maintain the significant improvements of their unpruned versions.
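To make the pruning idea in the abstract concrete, here is a minimal NumPy sketch of the two group norms and of one standard way to drive whole units to zero under the ℓ2,1 penalty: group soft-thresholding, the proximal operator of that norm. This is an illustration under stated assumptions, not the paper's implementation. It assumes each row of the weight matrix holds the weights of one hidden unit, and the function names (`l21_norm`, `linf1_norm`, `prox_l21`, `active_units`) and the threshold value are hypothetical.

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm: sum over rows of each row's Euclidean norm."""
    return float(np.linalg.norm(W, axis=1).sum())

def linf1_norm(W):
    """l-infinity,1 norm: sum over rows of each row's largest absolute value."""
    return float(np.abs(W).max(axis=1).sum())

def prox_l21(W, threshold):
    """Group soft-thresholding (proximal step for threshold * l2,1 norm).

    Each row's Euclidean norm is reduced by `threshold`; rows whose norm
    is at or below `threshold` become exactly zero.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all zeros.
    scale = np.maximum(0.0, 1.0 - threshold / np.maximum(norms, 1e-12))
    return W * scale

def active_units(W):
    """Count rows with at least one nonzero weight (assumed: one row per unit)."""
    return int(np.count_nonzero(np.abs(W).sum(axis=1)))

if __name__ == "__main__":
    # Toy weights for three hypothetical hidden units (row norms 5, 0.5, 2).
    W = np.array([[3.0, 4.0],
                  [0.3, 0.4],
                  [0.0, 2.0]])
    print("l2,1 norm:", l21_norm(W))
    print("l-inf,1 norm:", linf1_norm(W))
    W_pruned = prox_l21(W, threshold=1.0)
    print("units before:", active_units(W), "after:", active_units(W_pruned))
```

In this toy case the second row has norm 0.5, below the threshold of 1.0, so the proximal step zeroes it and the corresponding unit could be removed from the network. In a training loop, a step like this would typically follow each gradient update, with the threshold tied to the learning rate and the regularization strength. The proximal step for the ℓ∞,1 norm is more involved and is not sketched here.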

