NE | LG | Feb 7, 2019

Ising-Dropout: A Regularization Method for Training and Compression of Deep Neural Networks

arXiv:1902.08673v1 | 31 citations
AI Analysis

This paper addresses overfitting and model compression for deep learning practitioners, offering an incremental improvement over standard dropout.

The paper tackles overfitting in deep neural networks by proposing Ising-Dropout, an adaptive regularization method that uses Ising energy to drop units during training, achieving compression of up to 41.18% on MNIST and 55.86% on Fashion-MNIST while maintaining competitive classification performance.

Overfitting is a major problem in training machine learning models, specifically deep neural networks. It may be caused by imbalanced datasets and by the initialization of the model parameters, both of which can make the model conform too closely to the training data and degrade its generalization performance on unseen data. The original dropout is a regularization technique that drops hidden units randomly during training. In this paper, we propose an adaptive technique that selectively drops the visible and hidden units in a deep neural network using the Ising energy of the network. The preliminary results show that the proposed approach can keep the classification performance competitive with the original network while eliminating the optimization of unnecessary network parameters in each training cycle. The dropout state of units can also be applied to the trained (inference) model. This technique could compress the network, in terms of number of parameters, by up to 41.18% and 55.86% for the classification task on the MNIST and Fashion-MNIST datasets, respectively.
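
The contrast between random dropout and an energy-guided choice of units can be sketched roughly as follows. This is only an illustration and not the paper's algorithm: the abstract does not say how the Ising couplings are derived from the network or how the energy is minimized. The coupling matrix `J`, the field `h` (random placeholders here), the greedy spin-flip search, and all function names are therefore assumptions made for this sketch.

```python
import numpy as np

def random_dropout_mask(n_units, keep_prob, rng):
    """Original dropout: keep each unit independently with probability keep_prob."""
    return (rng.random(n_units) < keep_prob).astype(np.int8)

def ising_energy(state, J, h):
    """Generic Ising energy E(s) = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i, with s_i in {-1, +1}.

    J is assumed symmetric with a zero diagonal, so 0.5 * s^T J s counts each pair once.
    """
    s = np.asarray(state, dtype=float)
    return -0.5 * (s @ J @ s) - h @ s

def greedy_low_energy_state(J, h, sweeps=10):
    """Hypothetical search: flip single spins whenever the flip lowers the energy."""
    n = len(h)
    s = np.ones(n)  # start with every unit kept
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            # Energy change from flipping spin i: dE = 2 * s_i * (sum_j J_ij s_j + h_i)
            dE = 2.0 * s[i] * (J[i] @ s + h[i])
            if dE < 0:
                s[i] = -s[i]
                changed = True
        if not changed:
            break
    return s

# Placeholder couplings and fields (assumption: in practice these would be
# computed from the network; random values are used here only to run the sketch).
rng = np.random.default_rng(0)
n_units = 8
A = rng.normal(size=(n_units, n_units))
J = (A + A.T) / 2.0
np.fill_diagonal(J, 0.0)
h = rng.normal(size=n_units)

random_mask = random_dropout_mask(n_units, keep_prob=0.5, rng=rng)
spins = greedy_low_energy_state(J, h)
ising_mask = (spins > 0).astype(np.int8)  # +1 read as "keep", -1 as "drop"

print("random mask:", random_mask)
print("energy-guided mask:", ising_mask)
```

Reading a spin of +1 as "keep the unit" and -1 as "drop it" is also an assumption of this sketch. The greedy search only reaches a local minimum of the energy, which is enough to show how a mask can be chosen by an energy criterion instead of by coin flips.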

