NE · CV · LG · Jul 3, 2012

Improving neural networks by preventing co-adaptation of feature detectors

arXiv:1207.0580v1 · 7999 citations
Originality: Highly original
AI Analysis

This paper addresses overfitting, a problem facing machine learning researchers and practitioners, and presents a novel method rather than an incremental improvement.

The paper tackles overfitting in large neural networks trained on small datasets by introducing dropout, which randomly omits half of the feature detectors on each training case. This yields significant improvements on benchmark tasks and sets new records in speech and object recognition.

When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random "dropout" gives big improvements on many benchmark tasks and sets new records for speech and object recognition.
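
The procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `dropout_forward`, its parameters, and the array shapes are hypothetical. The test-time branch assumes the usual companion step of keeping every unit and halving its contribution, which the abstract above does not spell out.

```python
import numpy as np

def dropout_forward(h, rng, p_drop=0.5, train=True):
    """Apply dropout to hidden activations h (shape: cases x units).

    Hypothetical helper for illustration only.
    """
    if train:
        # Draw an independent keep/omit decision for every unit on every
        # training case; each unit is omitted with probability p_drop.
        mask = rng.random(h.shape) >= p_drop
        return h * mask
    # Assumed test-time rule: keep all units but scale by the keep
    # probability so the expected input to the next layer matches training.
    return h * (1.0 - p_drop)

rng = np.random.default_rng(0)
h = np.ones((4, 1000))                      # 4 training cases, 1000 hidden units
train_out = dropout_forward(h, rng, train=True)
test_out = dropout_forward(h, rng, train=False)
```

Because a fresh mask is drawn for each case, no unit can rely on a specific set of other units being present, which is the co-adaptation the abstract says dropout prevents.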

Code Implementations: 11 repos

Data from Papers with Code (CC-BY-SA-4.0)

