LG AI NE ML · Jun 23, 2021

Should You Go Deeper? Optimizing Convolutional Neural Network Architectures without Training by Receptive Field Analysis

Mats L. Richter, Julius Schöning, Anna Wiedenroth, Ulf Krumnack

arXiv:2106.12307v28.418 citations

Originality: Incremental advance

AI Analysis

This work addresses inefficiencies in CNN design for image-based tasks with an optimization approach that requires no training. The advance is incremental, but it could eventually support automated architecture design.

The paper addresses the over-parameterization of convolutional neural networks (CNNs): unnecessary layers raise resource usage while adding little predictive performance. It proposes identifying and removing unproductive layers through receptive field analysis, without training the model, to obtain architectures that are more efficient and easier to explain.

When optimizing convolutional neural networks (CNNs) for a specific image-based task, specialists commonly overshoot the number of convolutional layers in their designs. As a consequence, these CNNs are unnecessarily resource intensive to train and deploy, with diminishing beneficial effects on the predictive performance. The features a convolutional layer can process are strictly limited by its receptive field. By analyzing the size of the receptive fields layer by layer, we can reliably predict sequences of layers that will not contribute qualitatively to the test accuracy in the given CNN architecture. Based on this analysis, we propose design strategies built around a so-called border layer. This layer makes it possible to identify unproductive convolutional layers and hence to resolve these inefficiencies and to optimize the explainability and the computational performance of CNNs. Since neither the strategies nor the analysis requires training of the actual model, these insights allow for a very efficient design process of CNN architectures, which might be automated in the future.
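
The layer-wise receptive field analysis described in the abstract can be illustrated with a minimal Python sketch. It uses the standard receptive field recurrence for a purely sequential stack of convolution and pooling layers. The function names `receptive_fields` and `border_layer_index` are hypothetical, and the border-layer criterion used here (the first layer whose input already has a receptive field larger than the input resolution) is an assumption made for illustration, since the abstract does not spell out the exact definition.

```python
from typing import List, Optional, Tuple

# Each layer is described as (kernel_size, stride). This is an illustrative
# encoding for a sequential network, not a format taken from the paper.
Layer = Tuple[int, int]


def receptive_fields(layers: List[Layer]) -> List[int]:
    """Return the receptive field size (in input pixels) after each layer.

    Standard recurrence for a sequential stack:
        r_l = r_{l-1} + (k_l - 1) * j_{l-1}
        j_l = j_{l-1} * s_l
    where j is the cumulative stride ("jump") between neighbouring outputs.
    """
    r, jump = 1, 1
    fields = []
    for kernel, stride in layers:
        r = r + (kernel - 1) * jump
        jump *= stride
        fields.append(r)
    return fields


def border_layer_index(layers: List[Layer], input_size: int) -> Optional[int]:
    """Return the 0-based index of the first layer whose *input* receptive
    field already exceeds the input resolution, or None if no layer does.

    Assumption: this criterion stands in for the paper's border layer.
    """
    previous = 1  # receptive field of the raw input pixels
    for index, current in enumerate(receptive_fields(layers)):
        if previous > input_size:
            return index
        previous = current
    return None


if __name__ == "__main__":
    # Eight 3x3 convolutions with stride 1, applied to an 8x8 input.
    stack = [(3, 1)] * 8
    print(receptive_fields(stack))
    print(border_layer_index(stack, input_size=8))
```

Because the computation needs only kernel sizes, strides, and the input resolution, it runs before any training. Under the stated assumption, layers from the returned index onward are candidates for removal.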
