Quantization Loss Re-Learning Method
This work targets efficient deployment of LSTM models by quantizing their gate parameters.
The paper proposes the Quantization Loss Re-Learn Method, which quantizes LSTM gate parameters without significant performance loss: on a Named Entity Recognition dataset, the F1 score of the quantized model decreases by only 0.7% compared to the baseline.
To quantize the gate parameters of an LSTM (Long Short-Term Memory) neural network model with almost no degradation in recognition performance, this paper proposes a new quantization method, the Quantization Loss Re-Learn Method. The method applies lossy quantization to the gate parameters during the training iterations, and the weight parameters learn to offset the resulting quantization loss through an adjustment of the gradient in back propagation during weight optimization. We show the effectiveness of the method through theoretical derivation and experiments. The gate parameters are quantized to three values (0, 0.5, and 1), and on the Named Entity Recognition dataset the F1 score of the model with quantized gate parameters decreases by only 0.7% compared to the baseline model.
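
To make the three-value gate quantization concrete, the following is a minimal sketch of the forward quantization step only. It assumes that each gate activation is snapped to the nearest of the three levels; the abstract does not state the rounding rule, and the gradient adjustment used for re-learning is not reproduced here. The names LEVELS, sigmoid, quantize_gate, and quantization_loss are hypothetical and chosen for illustration.

```python
import math

# The three gate values named in the abstract.
LEVELS = (0.0, 0.5, 1.0)


def sigmoid(x: float) -> float:
    """Standard logistic function used for LSTM gate activations."""
    return 1.0 / (1.0 + math.exp(-x))


def quantize_gate(g: float) -> float:
    """Snap a gate activation in [0, 1] to the nearest level.

    Nearest-level rounding is an assumption; the paper may use
    different thresholds.
    """
    return min(LEVELS, key=lambda level: abs(g - level))


def quantization_loss(pre_activations: list[float]) -> list[float]:
    """Per-gate error g - q(g) that the weights would have to offset."""
    gates = [sigmoid(x) for x in pre_activations]
    return [g - quantize_gate(g) for g in gates]


if __name__ == "__main__":
    for x in (-5.0, -0.3, 0.0, 0.4, 5.0):
        g = sigmoid(x)
        print(f"pre-activation {x:+.1f}: gate {g:.4f} -> {quantize_gate(g)}")
```

In this sketch the per-gate error is at most 0.25 in magnitude, because every value in [0, 1] lies within 0.25 of one of the three levels. That error is the quantity the weight parameters would need to compensate for during training.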