LG, CV, Mar 28, 2022

To Fold or Not to Fold: a Necessary and Sufficient Condition on Batch-Normalization Layers Folding

arXiv:2203.14646v1, 11 citations, h-index: 21
Originality: Incremental advance
AI Analysis

This work addresses a computational bottleneck for deploying complex models on edge devices, offering an incremental improvement over existing BN folding methods.

The paper tackles the suboptimal removal of Batch-Normalization layers in deep neural networks for edge deployment by providing a necessary and sufficient condition for folding, resulting in dramatically reduced inference time.

Batch-Normalization (BN) layers have become fundamental components of ever more complex deep neural network architectures. Such models require acceleration before they can be deployed on edge devices. However, BN layers add computational bottlenecks because their operations are processed sequentially; thus, a key yet often overlooked component of the acceleration process is BN layer folding. In this paper, we demonstrate that current BN folding approaches are suboptimal in terms of how many layers can be removed. We therefore provide a necessary and sufficient condition for BN folding and a corresponding optimal algorithm. The proposed approach systematically outperforms existing baselines and dramatically reduces the inference time of deep neural networks.
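For readers unfamiliar with the operation the abstract refers to, the following is a minimal sketch of standard BN folding into a preceding linear layer at inference time. It is not the paper's necessary and sufficient condition or its optimal algorithm, only the textbook identity that folding relies on. The function names (`fold_bn_into_linear`, `batch_norm_inference`), the layer shapes, and the `eps` value are assumptions chosen for illustration.

```python
import numpy as np


def batch_norm_inference(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time BN: per-feature affine transform with fixed statistics."""
    return gamma * (y - mean) / np.sqrt(var + eps) + beta


def fold_bn_into_linear(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN parameters into the weights and bias of the preceding linear layer.

    Hypothetical helper for illustration. W has shape (out_features, in_features);
    all BN parameters have shape (out_features,).
    """
    scale = gamma / np.sqrt(var + eps)      # per-output-feature scaling
    W_folded = W * scale[:, None]           # scale each output row of W
    b_folded = (b - mean) * scale + beta    # shift the bias accordingly
    return W_folded, b_folded


# Illustrative random parameters (assumed shapes: 3 inputs, 4 outputs).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
b = rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
mean, var = rng.normal(size=4), rng.uniform(0.5, 2.0, size=4)
x = rng.normal(size=(8, 3))

# Two sequential operations: linear layer followed by BN.
reference = batch_norm_inference(x @ W.T + b, gamma, beta, mean, var)

# One operation: the folded linear layer alone.
W_f, b_f = fold_bn_into_linear(W, b, gamma, beta, mean, var)
folded = x @ W_f.T + b_f

max_abs_diff = float(np.max(np.abs(reference - folded)))
```

In this sketch the folded layer reproduces the linear-plus-BN output up to floating-point error while removing one sequential operation, which is the source of the inference-time savings the abstract describes. Deciding which BN layers in an arbitrary architecture can be folded this way is the question the paper addresses.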

Code Implementations: 1 repo