LG · OC · ML · Aug 25, 2020

Stochastic Markov Gradient Descent and Training Low-Bit Neural Networks

arXiv:2008.11117v2 · 5 citations
AI Analysis

This work addresses the challenge of reducing memory usage when training large neural networks. It is an incremental contribution to the field of neural network quantization.

The paper tackles the problem of training quantized neural networks under severe memory constraints. It introduces Stochastic Markov Gradient Descent (SMGD), a discrete optimization method, and supports it with theoretical guarantees and encouraging numerical results.

The massive size of modern neural networks has motivated substantial recent interest in neural network quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
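The abstract does not spell out the SMGD update rule, so the following is only a minimal sketch of the general idea of a discrete, memory-light optimizer, not the paper's algorithm. It assumes a stochastic-rounding-style step: each weight is stored as an integer level `k` on a uniform grid (weight = `k * delta`), and moves one grid step against the gradient with probability `min(1, lr * |g| / delta)`. The function name `quantized_markov_step` and the parameters `lr` and `delta` are hypothetical, chosen for illustration. When the probability is not clipped, the expected move equals the ordinary gradient step `-lr * g`, and no full-precision copy of the weights is ever kept.

```python
import random


def quantized_markov_step(levels, grads, lr, delta, rng):
    """One illustrative discrete update (an assumption, not the paper's exact rule).

    levels: integer grid indices; the real-valued weight is level * delta.
    grads:  gradient of the loss with respect to each weight.
    """
    new_levels = []
    for k, g in zip(levels, grads):
        # Probability of moving one grid step, clipped to a valid probability.
        p = min(1.0, lr * abs(g) / delta)
        if g != 0.0 and rng.random() < p:
            # Move against the gradient sign by exactly one grid step.
            k += -1 if g > 0 else 1
        new_levels.append(k)
    return new_levels


if __name__ == "__main__":
    # Toy usage: minimize f(w) = (w - 0.5)^2 on a grid with spacing 0.25.
    rng = random.Random(0)
    delta, lr = 0.25, 0.05
    levels = [8]  # starting weight is 8 * 0.25 = 2.0
    for _ in range(200):
        w = levels[0] * delta
        levels = quantized_markov_step(levels, [2.0 * (w - 0.5)], lr, delta, rng)
    print("final weight:", levels[0] * delta)
```

Storing integer levels rather than floats keeps every iterate exactly on the grid and makes the next state depend only on the current state and gradient, which is the Markov-chain flavor suggested by the method's name.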

