LG · CV · Nov 5, 2020

Conflicting Bundles: Adapting Architectures Towards the Improved Training of Deep Neural Networks

arXiv:2011.02956v1 · 7 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of optimizing neural network architectures for better training efficiency and performance in computer vision tasks. The contribution appears incremental, as it builds on existing architecture design methods.

The paper tackles the problem of identifying and removing layers that decrease a neural network's test accuracy. It introduces a theory and a metric to detect such layers early in training, along with an algorithm that removes them automatically, achieving competitive accuracy while reducing memory consumption and inference time.

Designing neural network architectures is a challenging task, and knowing which specific layers of a model must be adapted to improve performance is almost a mystery. In this paper, we introduce a novel theory and metric to identify layers that decrease the test accuracy of the trained models; this identification is possible as early as the beginning of training. In the worst case, such a layer could lead to a network that cannot be trained at all. More precisely, we identify the layers that worsen performance because they produce conflicting training bundles, as we show in our novel theoretical analysis, complemented by extensive empirical studies. Based on these findings, a novel algorithm is introduced to remove performance-decreasing layers automatically. Architectures found by this algorithm achieve competitive accuracy when compared against state-of-the-art architectures. While keeping such high accuracy, our approach drastically reduces memory consumption and inference time for different computer vision tasks.
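
The abstract does not define a "conflicting training bundle", so the following is only a minimal sketch under a stated assumption: a bundle is taken to be a group of training samples that a layer maps to the same output (up to rounding), and a bundle is treated as conflicting when its samples carry different labels. The function names `find_conflicting_bundles` and `conflict_ratio`, and the `decimals` rounding parameter, are hypothetical and are not taken from the paper or its repository.

```python
from collections import defaultdict


def find_conflicting_bundles(layer_outputs, labels, decimals=6):
    """Group samples by their (rounded) layer output; return groups with mixed labels.

    Assumption (not from the source text): samples that a layer maps to the
    same output form one bundle, and a bundle conflicts if its labels differ.
    """
    bundles = defaultdict(list)
    for index, output in enumerate(layer_outputs):
        # Round so that outputs equal up to floating-point noise share a key.
        key = tuple(round(value, decimals) for value in output)
        bundles[key].append(index)

    conflicting = []
    for members in bundles.values():
        if len({labels[i] for i in members}) > 1:
            conflicting.append(members)
    return conflicting


def conflict_ratio(layer_outputs, labels, decimals=6):
    """Fraction of samples that fall into a conflicting bundle (hypothetical score)."""
    if not labels:
        return 0.0
    conflicting = find_conflicting_bundles(layer_outputs, labels, decimals)
    return sum(len(members) for members in conflicting) / len(labels)


# Toy example: samples 0 and 1 collapse to the same output but differ in label.
outputs = [[0.0, 1.0], [0.0, 1.0], [0.5, 0.5], [0.5, 0.5]]
targets = [0, 1, 2, 2]
bundles_found = find_conflicting_bundles(outputs, targets)
ratio = conflict_ratio(outputs, targets)
```

Under this assumption, a layer with a high ratio early in training would be a candidate for removal, since the layers after it can no longer tell apart samples that need different predictions. The paper's actual metric and removal algorithm may differ from this simplified score.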

Code Implementations: 1 repo
