LG · Feb 10, 2022

Quantization in Layer's Input is Matter

arXiv:2202.05137v1
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficient model compression for deployment in resource-constrained environments, though it appears incremental as it builds on existing quantization techniques.

The paper tackles the problem of quantization in neural networks by demonstrating that quantizing layer inputs is more critical for minimizing loss than quantizing parameters, and presents an algorithm based on input quantization error that outperforms Hessian-based mixed precision methods.

In this paper, we show that quantization of a layer's input matters more to the loss function than quantization of its parameters. We also show that an algorithm based on the layer's input quantization error outperforms the Hessian-based mixed-precision layout algorithm.
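The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of choosing a mixed-precision layout from input quantization error. Everything in it is an assumption made for illustration: the symmetric uniform quantizer, the mean-squared-error metric, the hypothetical helper names (`quantize_uniform`, `input_quant_error`, `assign_bits`), the two bit-widths, and the synthetic layer inputs. It should not be read as the authors' method.

```python
import numpy as np

def quantize_uniform(x, bits):
    """Symmetric uniform quantization of x to the given bit-width (assumed scheme)."""
    levels = 2 ** (bits - 1) - 1          # e.g. 7 for 4 bits, 127 for 8 bits
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return x.copy()                   # nothing to quantize
    scale = max_abs / levels
    return np.clip(np.round(x / scale), -levels, levels) * scale

def input_quant_error(x, bits):
    """Mean squared error introduced by quantizing a layer's input."""
    return float(np.mean((x - quantize_uniform(x, bits)) ** 2))

def assign_bits(layer_inputs, low_bits=4, high_bits=8, n_high=1):
    """Give high_bits to the n_high layers whose inputs lose the most at low_bits."""
    errors = {name: input_quant_error(x, low_bits)
              for name, x in layer_inputs.items()}
    ranked = sorted(errors, key=errors.get, reverse=True)
    keep_high = set(ranked[:n_high])
    return {name: (high_bits if name in keep_high else low_bits)
            for name in errors}

# Synthetic stand-ins for activations captured at each layer's input.
rng = np.random.default_rng(0)
layer_inputs = {
    "conv1": rng.normal(0.0, 1.0, size=4096),
    "conv2": rng.normal(0.0, 5.0, size=4096),   # wider range, larger error
    "fc": rng.normal(0.0, 0.1, size=4096),
}
layout = assign_bits(layer_inputs)
```

Here `layout` maps each layer name to a bit-width, with the layer whose input suffers the largest quantization error kept at the higher precision. A Hessian-based layout would instead rank layers by a second-order sensitivity estimate of the loss with respect to the parameters; this sketch needs only a forward pass over calibration data to collect layer inputs.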

