LGSYNov 30, 2020

Robust error bounds for quantised and pruned neural networks

arXiv:2012.00138v24 citations
AI Analysis

This work provides a method to quantify the performance degradation of pruned and quantized neural networks, which is crucial for deploying these models in safety-critical systems.

This paper introduces a semi-definite program to bound the worst-case error caused by pruning or quantizing neural networks. The method provides robust error bounds for various neural network structures and nonlinear activation functions across specified input sets.

With the rise of smartphones and the internet-of-things, data is increasingly getting generated at the edge on local, personal devices. For privacy, latency and energy saving reasons, this shift is causing machine learning algorithms to move towards decentralisation with the data and algorithms stored, and even trained, locally on devices. The device hardware becomes the main bottleneck for model capability in this set-up, creating a need for slimmed down, more efficient neural networks. Neural network pruning and quantisation are two methods that have been developed for this, with both approaches demonstrating impressive results in reducing the computational cost without sacrificing significantly on model performance. However, the understanding behind these reduction methods remains underdeveloped. To address this issue, a semi-definite program is introduced to bound the worst-case error caused by pruning or quantising a neural network. The method can be applied to many neural network structures and nonlinear activation functions with the bounds holding robustly for all inputs in specified sets. It is hoped that the computed bounds will provide certainty to the performance of these algorithms when deployed on safety-critical systems.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes