A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off
This paper addresses the challenge of deploying neural networks in resource-constrained environments by providing theoretical insight into the trade-off between quantization and depth. The contribution is incremental in that it builds on existing mean-field techniques.
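The paper tackles the problem of quantization degrading signal propagation in deep neural networks at initialization. It derives initialization schemes that maximize signal propagation and obtains a closed-form implicit equation for the maximal trainable depth $L_{\max}$ as a function of $N$, the number of quantization levels in the activation function. Solving that equation numerically gives the asymptotic scaling $L_{\max}\propto N^{1.82}$.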
Reducing the precision of weights and activation functions in neural network training, with minimal impact on performance, is essential for the deployment of these models in resource-constrained environments. We apply mean-field techniques to networks with quantized activations in order to evaluate the degree to which quantization degrades signal propagation at initialization. We derive initialization schemes that maximize signal propagation in such networks and suggest why this is helpful for generalization. Building on these results, we obtain a closed-form implicit equation for $L_{\max}$, the maximal trainable depth (and hence model capacity), given $N$, the number of quantization levels in the activation function. Solving this equation numerically, we obtain asymptotically: $L_{\max}\propto N^{1.82}$.
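To give a feel for what the reported scaling implies, the short sketch below evaluates the ratio of maximal trainable depths between two quantization settings. It assumes the pure power law $L_{\max}\propto N^{1.82}$ holds exactly, whereas the abstract states it only asymptotically. The proportionality constant is not given here, so only ratios are computed, not absolute depths. The helper name `depth_ratio` is hypothetical and introduced purely for illustration.

```python
# Exponent reported in the abstract (an asymptotic result).
# The prefactor of the power law is not stated there, so we only compare ratios.
EXPONENT = 1.82


def depth_ratio(n_from: int, n_to: int, exponent: float = EXPONENT) -> float:
    """Return L_max(n_to) / L_max(n_from), assuming L_max is proportional to N**exponent.

    This is an illustrative helper, not code from the paper.
    """
    if n_from <= 0 or n_to <= 0:
        raise ValueError("number of quantization levels must be positive")
    return (n_to / n_from) ** exponent


if __name__ == "__main__":
    # Compare several settings against a baseline of N = 2 levels.
    for n in (4, 8, 16):
        ratio = depth_ratio(2, n)
        print(f"N: 2 -> {n}: L_max grows by a factor of about {ratio:.2f}")
```

Under this assumption, doubling the number of quantization levels multiplies the maximal trainable depth by $2^{1.82}\approx 3.53$. The growth is faster than linear but slower than quadratic in $N$.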