LG · ML · May 23, 2018

Approximate Random Dropout

arXiv:1805.08939v2 · 12 citations
Originality: Incremental advance
AI Analysis

This work addresses the time and energy cost of DNN training, a practical concern for AI practitioners. The advance is incremental because it builds on existing dropout methods.

The paper tackles the high computational cost of training deep neural networks by proposing Approximate Random Dropout, which replaces random dropout with regular, predefined dropout patterns. At dropout rates of 0.3-0.7, this reduces training time by 20%-77% on MLP and 19%-60% on LSTM with a marginal accuracy drop.

The training phase of deep neural networks (DNNs) consumes enormous processing time and energy. Compression techniques that exploit the sparsity of DNNs can effectively accelerate the inference phase. However, they can hardly be used in the training phase, because training involves dense matrix multiplication using General-Purpose Computation on Graphics Processors (GPGPU), which favors regular and structured data layouts. In this paper, we propose Approximate Random Dropout, which replaces the conventional random dropout of neurons and synapses with regular, predefined patterns to eliminate unnecessary computation and data access. To compensate for the potential performance loss, we develop an SGD-based Search Algorithm to produce the distribution of dropout patterns. We prove that our approach is statistically equivalent to the previous dropout method. Experimental results on MLP and LSTM using well-known benchmarks show that the proposed Approximate Random Dropout can reduce the training time by $20\%$-$77\%$ ($19\%$-$60\%$) when the dropout rate is $0.3$-$0.7$ on MLP (LSTM) with a marginal accuracy drop.
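To make the idea of a regular dropout pattern concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the pattern (keep one unit out of every `period`, starting at `offset`), the function names, and the rescaling are all hypothetical choices made for this example. The point it shows is that when the kept units follow a regular pattern, the layer can be computed as a smaller dense matrix multiplication instead of a full multiplication followed by masking. The last function shows how a probability distribution over patterns could yield a target expected dropout rate; the SGD-based search that the abstract mentions for finding such a distribution is not reproduced here.

```python
import numpy as np

def periodic_keep_indices(n_units, period, offset):
    """Hypothetical regular pattern: keep one unit out of every `period`."""
    return np.arange(offset % period, n_units, period)

def compact_forward_pattern_dropout(x, W, period, offset):
    """Compute only the kept output units with a smaller dense matmul.

    x: (batch, n_in), W: (n_in, n_out). Dropped units are never computed.
    Outputs are rescaled by 1 / keep_fraction (inverted-dropout convention,
    an assumption of this sketch).
    """
    keep = periodic_keep_indices(W.shape[1], period, offset)
    scale = W.shape[1] / len(keep)
    h_small = x @ W[:, keep]  # (batch, len(keep)) instead of (batch, n_out)
    return h_small * scale, keep

def expected_dropout_rate(period_probs):
    """Expected dropout rate when a period is sampled from `period_probs`.

    A pattern with period p drops a fraction 1 - 1/p of the units
    (exact when n_out is divisible by p).
    """
    return sum(prob * (1.0 - 1.0 / p) for p, prob in period_probs.items())

# Example: 6 output units, period 3, offset 1 keeps units 1 and 4.
x = np.ones((2, 4))
W = np.arange(24, dtype=float).reshape(4, 6)
h, keep = compact_forward_pattern_dropout(x, W, period=3, offset=1)
```

In this sketch, mixing period 1 (no dropout) with probability 0.4 and period 2 with probability 0.6 gives an expected dropout rate of 0.3, which is one way a distribution over patterns can approximate a non-periodic target rate.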

