Learning Multimodal Fixed-Point Weights using Gradient Descent
This work addresses the need for efficient neural network deployment on less powerful hardware, though the contribution appears incremental because it builds on existing quantization methods.
The paper tackles the problem of high computational complexity in deep neural networks by proposing a gradient-based optimization strategy for low-bit fixed-point quantization, achieving state-of-the-art performance with 2-bit weights.
Due to their high computational complexity, deep neural networks are still limited to powerful processing units. To reduce model complexity through low-bit fixed-point quantization, we propose a gradient-based optimization strategy to generate a symmetric mixture of Gaussian modes (SGM) where each mode belongs to a particular quantization stage. We achieve state-of-the-art performance at 2 bits and illustrate the model's ability to adapt its weights on its own during training.
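
The abstract does not spell out how the SGM is formed, so the following is only a minimal sketch of the general idea under several assumptions, not the authors' method. It assumes that a symmetric 2-bit grid keeps the three levels {-Δ, 0, +Δ}, that each level acts as the center of one mode, and that a quadratic penalty toward the nearest level (the negative log-density of a single Gaussian, up to constants) stands in for the full mixture. The task loss is omitted, and the step size, learning rate, iteration count, and all function names are hypothetical.

```python
def fixed_point_levels(bits, step):
    """Symmetric fixed-point grid: step * {-(2^(bits-1) - 1), ..., 0, ..., 2^(bits-1) - 1}.

    Assumption: the symmetric grid drops the most negative code,
    so 2 bits give three levels.
    """
    k = 2 ** (bits - 1) - 1
    return [step * i for i in range(-k, k + 1)]


def nearest_level(w, levels):
    """Return the quantization level closest to weight w."""
    return min(levels, key=lambda c: abs(w - c))


def mode_penalty_grad(w, levels):
    """Gradient of 0.5 * (w - c)^2 with respect to w, where c is the nearest level.

    Assumption: only the nearest mode contributes, which simplifies
    a true Gaussian mixture to a single quadratic term per weight.
    """
    return w - nearest_level(w, levels)


def pull_to_modes(weights, levels, lr=0.1, steps=200):
    """Plain gradient descent on the penalty alone (no task loss in this sketch)."""
    ws = list(weights)
    for _ in range(steps):
        ws = [w - lr * mode_penalty_grad(w, levels) for w in ws]
    return ws


levels = fixed_point_levels(bits=2, step=0.5)
weights = [-0.62, -0.2, 0.07, 0.31, 0.9]   # hypothetical full-precision weights
trained = pull_to_modes(weights, levels)
quantized = [nearest_level(w, levels) for w in trained]

print(levels)
print(quantized)
```

In this toy setting the weights drift toward their nearest level during the descent, so the final rounding step changes them only negligibly. In actual training such a penalty would be combined with the task loss, which is what would allow weights to move between modes rather than simply collapsing onto the closest one.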