LG · AI · NE · Apr 26, 2023

Guaranteed Quantization Error Computation for Neural Network Model Compression

arXiv:2304.13812v1 · 5 citations · h-index: 32
Originality: Incremental advance
AI Analysis

This work addresses the need for reliable error estimation in model compression for embedded systems, though it appears incremental as it builds on existing quantization and analysis techniques.

The paper addresses the problem of computing guaranteed output errors for neural networks compressed by quantization. Its method builds a merged network from the original network and its quantized version, then applies optimization and reachability analysis to that merged network to compute the guaranteed quantization error. The approach is validated on a numerical example.

Neural network model compression techniques can address the computational cost of running deep neural networks on embedded devices in industrial systems. This paper addresses the problem of computing a guaranteed output error for neural network compression with quantization. A merged neural network is built from a feedforward neural network and its quantized version to produce the exact output difference between the two networks. Then, optimization-based methods and reachability analysis methods are applied to the merged neural network to compute the guaranteed quantization error. Finally, a numerical example is presented to validate the applicability and effectiveness of the proposed approach.
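The merged-network idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes ReLU hidden layers with a linear output layer, a hypothetical uniform rounding quantizer (`quantize`, with an assumed step of 2^-4), and plain interval arithmetic as a stand-in for the reachability step. The helper names `forward`, `merge`, and `interval_bounds`, the layer sizes, and the input box [-1, 1]^2 are all illustrative assumptions.

```python
import numpy as np

def forward(layers, x):
    """Evaluate a feedforward network: ReLU hidden layers, linear output layer."""
    h = x
    for W, b in layers[:-1]:
        h = np.maximum(W @ h + b, 0.0)
    W, b = layers[-1]
    return W @ h + b

def quantize(layers, step=2.0 ** -4):
    """Hypothetical quantizer: round every weight and bias to a multiple of `step`."""
    return [(np.round(W / step) * step, np.round(b / step) * step) for W, b in layers]

def merge(orig, quant):
    """Build one network whose output equals forward(orig, x) - forward(quant, x)."""
    (W1, b1), (V1, c1) = orig[0], quant[0]
    # First layer: both branches read the same input, so stack the weights.
    merged = [(np.vstack([W1, V1]), np.concatenate([b1, c1]))]
    # Hidden layers: block-diagonal weights keep the two branches separate.
    for (W, b), (V, c) in zip(orig[1:-1], quant[1:-1]):
        top = np.hstack([W, np.zeros((W.shape[0], V.shape[1]))])
        bottom = np.hstack([np.zeros((V.shape[0], W.shape[1])), V])
        merged.append((np.vstack([top, bottom]), np.concatenate([b, c])))
    # Output layer: subtract the quantized branch from the original branch.
    (WL, bL), (VL, cL) = orig[-1], quant[-1]
    merged.append((np.hstack([WL, -VL]), bL - cL))
    return merged

def interval_bounds(layers, lo, hi):
    """Propagate the input box [lo, hi] through the network with interval arithmetic."""
    for i, (W, b) in enumerate(layers):
        center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
        mid, dev = W @ center + b, np.abs(W) @ radius
        lo, hi = mid - dev, mid + dev
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

rng = np.random.default_rng(0)
sizes = [2, 8, 8, 1]  # illustrative architecture
orig = [(rng.standard_normal((m, n)), rng.standard_normal(m))
        for n, m in zip(sizes[:-1], sizes[1:])]
quant = quantize(orig)
merged = merge(orig, quant)

# Sound (but possibly loose) bound on the output error over the box [-1, 1]^2.
lo, hi = interval_bounds(merged, -np.ones(2), np.ones(2))
error_bound = float(np.max(np.maximum(np.abs(lo), np.abs(hi))))
print("guaranteed upper bound on |y - y_quantized|:", error_bound)
```

Interval arithmetic treats the two branches of the merged network as independent, so the resulting bound is typically conservative. Tighter or exact values would need the optimization-based or set-based reachability methods that the abstract mentions.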

