LG · CV · May 21, 2017

Shake-Shake regularization

arXiv:1705.07485v2 · 396 citations · Has Code
Originality: Highly original
AI Analysis

The paper addresses overfitting for deep learning practitioners by proposing a new regularization technique for multi-branch networks, improving on the best single-shot results published for CIFAR-10 and CIFAR-100 at the time.

The paper tackles overfitting in deep learning by introducing shake-shake regularization, which replaces the standard summation of parallel branches in a multi-branch network with a stochastic affine combination. Applied to 3-branch residual networks, it achieves test errors of 2.86% on CIFAR-10 and 15.85% on CIFAR-100.

The method introduced in this paper aims at helping deep learning practitioners faced with an overfit problem. The idea is to replace, in a multi-branch network, the standard summation of parallel branches with a stochastic affine combination. Applied to 3-branch residual networks, shake-shake regularization improves on the best single shot published results on CIFAR-10 and CIFAR-100 by reaching test errors of 2.86% and 15.85%. Experiments on architectures without skip connections or Batch Normalization show encouraging results and open the door to a large set of applications. Code is available at https://github.com/xgastaldi/shake-shake
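The "stochastic affine combination" described above can be illustrated with a minimal sketch. It assumes a residual block with a skip connection and two parallel branches, combined as x + alpha * branch1 + (1 - alpha) * branch2, with alpha drawn uniformly from [0, 1) during training and fixed to 0.5 at evaluation. The function name `shake_shake_forward` and the use of plain Python lists in place of feature tensors are hypothetical choices for illustration; this is not the code from the linked repository, and it covers the forward pass only.

```python
import random


def shake_shake_forward(x, branch1, branch2, training=True, rng=None):
    """Combine two residual-branch outputs with a skip connection.

    x, branch1, branch2: equal-length lists of floats (hypothetical
    stand-ins for feature tensors). Returns (output, alpha).
    """
    if not (len(x) == len(branch1) == len(branch2)):
        raise ValueError("all inputs must have the same length")

    if training:
        # Assumption: one random coefficient per call, uniform on [0, 1).
        rng = rng or random
        alpha = rng.random()
    else:
        # At evaluation, use the expected value of alpha, which reduces
        # the combination to a plain average of the two branches.
        alpha = 0.5

    # Affine combination of the branches, added to the skip connection.
    out = [
        xi + alpha * b1 + (1.0 - alpha) * b2
        for xi, b1, b2 in zip(x, branch1, branch2)
    ]
    return out, alpha


if __name__ == "__main__":
    skip = [1.0, 2.0]
    f1 = [2.0, 4.0]
    f2 = [4.0, 0.0]
    print(shake_shake_forward(skip, f1, f2, training=False))
    print(shake_shake_forward(skip, f1, f2, training=True, rng=random.Random(0)))
```

Because the two coefficients sum to 1, the training-time output always lies between the two extremes where only one branch is used. A real implementation would operate on tensors inside an autograd framework and would also need to decide how gradients are scaled in the backward pass, which this sketch does not attempt.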

Code Implementations: 13 repos

Data from Papers with Code (CC-BY-SA-4.0)

