CV · Feb 20, 2018

Do deep nets really need weight decay and dropout?

arXiv:1802.07042v3 · 24 citations
Originality: Incremental advance
AI Analysis

This work challenges common regularization practices in deep learning and could simplify model training for computer vision tasks.

The paper investigates whether weight decay and dropout are necessary for object recognition in deep neural networks, finding that they may not be needed when sufficient data augmentation is used.

The impressive success of modern deep neural networks on computer vision tasks has been achieved through models of very large capacity compared to the number of available training examples. This overparameterization is often said to be controlled with the help of different regularization techniques, mainly weight decay and dropout. However, since these techniques reduce the effective capacity of the model, typically even deeper and wider architectures are required to compensate for the reduced capacity. Therefore, there seems to be a waste of capacity in this practice. In this paper we build upon recent research that suggests that explicit regularization may not be as important as widely believed and carry out an ablation study that concludes that weight decay and dropout may not be necessary for object recognition if enough data augmentation is introduced.
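The ablation described in the abstract can be pictured as a grid of training configurations in which weight decay and dropout are switched on or off across different amounts of data augmentation. The following is a minimal sketch of such a grid, not the authors' protocol: the `AblationConfig` class, the `build_ablation_grid` helper, the default coefficients (`5e-4`, `0.5`), and the augmentation labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class AblationConfig:
    """One training setup in a hypothetical regularization ablation."""
    weight_decay: float   # L2 penalty coefficient; 0.0 disables weight decay
    dropout_rate: float   # probability of dropping a unit; 0.0 disables dropout
    augmentation: str     # illustrative label for the augmentation scheme


def build_ablation_grid(weight_decay=5e-4,
                        dropout_rate=0.5,
                        augmentation_levels=("none", "light", "heavy")):
    """Enumerate every on/off combination of weight decay and dropout
    for each augmentation level (values are assumptions, not from the paper)."""
    grid = []
    for use_wd, use_do, aug in product((True, False), (True, False),
                                       augmentation_levels):
        grid.append(AblationConfig(
            weight_decay=weight_decay if use_wd else 0.0,
            dropout_rate=dropout_rate if use_do else 0.0,
            augmentation=aug,
        ))
    return grid


configs = build_ablation_grid()
for cfg in configs:
    print(cfg)
```

Each configuration would then be trained and evaluated under otherwise identical settings, so that any difference in test accuracy can be attributed to the explicit regularizers or to the augmentation level.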

Code Implementations: 1 repo
