LG ML Nov 14, 2018

Drop-Activation: Implicit Parameter Reduction and Harmonic Regularization

arXiv:1811.05850v5, 8 citations, Has Code
Originality Incremental advance
AI Analysis

This paper addresses overfitting for deep learning practitioners. It offers an incremental improvement that is compatible with existing techniques such as Batch Normalization.

The paper tackles overfitting in deep learning by proposing Drop-Activation, a regularization method that randomly sets activation functions to identity during training and uses a deterministic network for testing, resulting in improved performance on image classification datasets like CIFAR-10, CIFAR-100, SVHN, EMNIST, and ImageNet.

Overfitting frequently occurs in deep learning. In this paper, we propose a novel regularization method called Drop-Activation to reduce overfitting and improve generalization. The key idea is to drop nonlinear activation functions by setting them to be identity functions randomly during training time. During testing, we use a deterministic network with a new activation function to encode the average effect of dropping activations randomly. Our theoretical analyses support the regularization effect of Drop-Activation as implicit parameter reduction and verify its capability to be used together with Batch Normalization (Ioffe and Szegedy 2015). The experimental results on CIFAR-10, CIFAR-100, SVHN, EMNIST, and ImageNet show that Drop-Activation generally improves the performance of popular neural network architectures for the image classification task. Furthermore, as a regularizer Drop-Activation can be used in harmony with standard training and regularization techniques such as Batch Normalization and Auto Augment (Cubuk et al. 2019). The code is available at https://github.com/LeungSamWai/Drop-Activation.
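
The train/test behaviour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their repository is linked above). It assumes ReLU as the nonlinearity, an independent per-element random choice, and a hypothetical parameter `keep_prob` for the probability of keeping the nonlinearity. The function names are illustrative only. Under these assumptions, the test-time activation that encodes the average effect is `keep_prob * relu(x) + (1 - keep_prob) * x`, which has the shape of a leaky ReLU with negative slope `1 - keep_prob`.

```python
import random


def relu(x):
    """Standard ReLU, used here as the assumed nonlinearity."""
    return max(0.0, x)


def drop_activation_train(xs, keep_prob, rng):
    """Training mode: each element keeps ReLU with probability keep_prob;
    otherwise the activation is replaced by the identity."""
    return [relu(x) if rng.random() < keep_prob else x for x in xs]


def drop_activation_test(xs, keep_prob):
    """Test mode: deterministic activation equal to the expected value of
    the training-mode output under the assumptions stated above."""
    return [keep_prob * relu(x) + (1.0 - keep_prob) * x for x in xs]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded generator so the run is repeatable
    inputs = [-2.0, -0.5, 0.0, 1.5]
    keep_prob = 0.95  # hypothetical value chosen for illustration
    print(drop_activation_train(inputs, keep_prob, rng))
    print(drop_activation_test(inputs, keep_prob))
```

Positive inputs pass through unchanged in both modes, since ReLU and the identity agree there. Only negative inputs differ: in training they are either zeroed or passed through, and at test time they are scaled by `1 - keep_prob`.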
