LG · NE · Dec 21, 2013

Do Deep Nets Really Need to be Deep?

arXiv:1312.6184v7 · 2236 citations
Originality: Incremental advance
AI Analysis

This work challenges the prevailing assumption in machine learning that depth is essential for high performance, which could enable simpler and more efficient models for tasks such as speech recognition.

The paper asks whether deep neural networks are necessary for state-of-the-art performance. It shows that shallow feed-forward networks trained to mimic deep models can learn the same complex functions and reach similar accuracy, in some cases with a comparable number of parameters, as demonstrated on the TIMIT phoneme recognition task.

Currently, deep neural networks are the state of the art on problems such as speech recognition and computer vision. In this extended abstract, we show that shallow feed-forward networks can learn the complex functions previously learned by deep nets and achieve accuracies previously only achievable with deep models. Moreover, in some cases the shallow neural nets can learn these deep functions using a total number of parameters similar to the original deep model. We evaluate our method on the TIMIT phoneme recognition task and are able to train shallow fully-connected nets that perform similarly to complex, well-engineered, deep convolutional architectures. Our success in training shallow neural nets to mimic deeper models suggests that there probably exist better algorithms for training shallow feed-forward nets than those currently available.
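The idea of training a shallow net to mimic a deeper model can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the training objective, so the sketch assumes squared-error regression on the teacher's pre-softmax outputs. The teacher here is a randomly initialised stand-in rather than a trained model, the inputs are synthetic rather than TIMIT features, and all names (`teacher_logits`, `student_forward`, `mimic_loss`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    # Scaled Gaussian weights and zero biases.
    w = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out))
    return w, np.zeros(n_out)

# Assumption: a fixed random 3-hidden-layer net stands in for a trained deep model.
teacher = [init_layer(20, 32), init_layer(32, 32),
           init_layer(32, 32), init_layer(32, 5)]

def teacher_logits(x):
    h = x
    for w, b in teacher[:-1]:
        h = np.tanh(h @ w + b)
    w, b = teacher[-1]
    return h @ w + b  # pre-softmax outputs

# Shallow student: a single hidden layer.
w1, b1 = init_layer(20, 128)
w2, b2 = init_layer(128, 5)

def student_forward(x):
    h = np.tanh(x @ w1 + b1)
    return h, h @ w2 + b2

def mimic_loss(x):
    _, z = student_forward(x)
    return float(np.mean((z - teacher_logits(x)) ** 2))

# Synthetic, unlabeled inputs: only the teacher's outputs are used as targets.
x_train = rng.normal(size=(2000, 20))
targets = teacher_logits(x_train)

loss_before = mimic_loss(x_train)
lr = 0.2
for step in range(300):
    h, z = student_forward(x_train)
    grad_z = 2.0 * (z - targets) / z.size   # gradient of the mean squared error
    grad_w2 = h.T @ grad_z
    grad_b2 = grad_z.sum(axis=0)
    grad_pre = (grad_z @ w2.T) * (1.0 - h ** 2)  # back through tanh
    grad_w1 = x_train.T @ grad_pre
    grad_b1 = grad_pre.sum(axis=0)
    # Plain full-batch gradient descent, updating the student in place.
    w1 -= lr * grad_w1
    b1 -= lr * grad_b1
    w2 -= lr * grad_w2
    b2 -= lr * grad_b2
loss_after = mimic_loss(x_train)

print("mimic loss before:", loss_before)
print("mimic loss after: ", loss_after)
```

The student never sees class labels in this sketch; it is fitted only to the teacher's outputs, which is the sense of "mimic" assumed here. Layer sizes, learning rate, and step count are arbitrary illustrative choices.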

Code Implementations: 2 repos
