LG OC Nov 27, 2023

SensLI: Sensitivity-Based Layer Insertion for Neural Networks

Leonie Kreis, Evelyn Herberg, Frederik Köhne, Anton Schiela, Roland Herzog

arXiv:2311.15995v2 2.0 h-index: 3 Has Code

Originality: Incremental advance

AI Analysis

SensLI addresses the tedious manual tuning of network architectures that neural network practitioners face, though it appears to be an incremental improvement on existing adaptive architecture methods.

The authors tackle manual network architecture tuning by proposing SensLI, a sensitivity-based layer insertion method that adds layers during training. In their numerical experiments, the approach improved training loss and test error compared to training on a fixed architecture, and required less computational effort than training the extended architecture from the beginning.

The training of neural networks requires tedious and often manual tuning of the network architecture. We propose a systematic approach to inserting new layers during the training process. Our method eliminates the need to choose a fixed network size before training, is numerically inexpensive to execute and applicable to various architectures including fully connected feedforward networks, ResNets and CNNs. Our technique borrows ideas from constrained optimization and is based on first-order sensitivity information of the loss function with respect to the virtual parameters that additional layers, if inserted, would offer. In numerical experiments, our proposed sensitivity-based layer insertion technique (SensLI) exhibits improved performance on training loss and test error, compared to training on a fixed architecture, and reduced computational effort in comparison to training the extended architecture from the beginning. Our code is available on https://github.com/mathemml/SensLI.
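The idea of first-order sensitivities with respect to virtual parameters can be illustrated with a small sketch. This is not the authors' implementation (see their repository for that); it is a minimal NumPy example under stated assumptions: a toy residual network with layers of the form h + tanh(hW), a mean squared error loss, and a candidate layer initialized with W = 0 so that it acts as the identity and leaves the network output unchanged. The function names (`virtual_gradients`, `best_insertion_position`, `insert_layer`) and the choice of the Frobenius norm as the selection criterion are hypothetical and chosen for illustration only.

```python
import numpy as np

def forward(x, weights):
    """Toy residual net: h <- h + tanh(h @ W). Returns all activations."""
    acts = [x]
    for W in weights:
        h = acts[-1]
        acts.append(h + np.tanh(h @ W))
    return acts

def loss(x, y, weights):
    """Mean (over samples) of half the squared error of the network output."""
    out = forward(x, weights)[-1]
    return 0.5 * np.sum((out - y) ** 2) / x.shape[0]

def activation_gradients(acts, weights, y):
    """Backpropagate dL/dh to every activation, including the input."""
    grads = [None] * len(acts)
    grads[-1] = (acts[-1] - y) / y.shape[0]
    for k in reversed(range(len(weights))):
        z = acts[k] @ weights[k]
        local = grads[k + 1] * (1.0 - np.tanh(z) ** 2)
        # Skip connection contributes grads[k + 1]; the tanh branch adds the rest.
        grads[k] = grads[k + 1] + local @ weights[k].T
    return grads

def virtual_gradients(x, y, weights):
    """Gradient of the loss w.r.t. a zero-initialized layer at each position.

    Position p means "insert before existing layer p" (p == len(weights)
    appends at the end). At W_new = 0 the derivative of tanh is 1, so the
    gradient reduces to acts[p].T @ dL/d(acts[p]); no extended network
    has to be built or trained to obtain it.
    """
    acts = forward(x, weights)
    grads = activation_gradients(acts, weights, y)
    return [h.T @ g for h, g in zip(acts, grads)]

def best_insertion_position(x, y, weights):
    """Assumed criterion: pick the position with the largest gradient norm."""
    sens = [float(np.linalg.norm(G)) for G in virtual_gradients(x, y, weights)]
    return int(np.argmax(sens)), sens

def insert_layer(weights, position):
    """Insert an identity-acting (zero-weight) residual layer."""
    d = weights[0].shape[0]
    return weights[:position] + [np.zeros((d, d))] + weights[position:]

# Usage on random data (illustrative only, not an experiment from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 4))
y = rng.normal(size=(16, 4))
weights = [0.1 * rng.normal(size=(4, 4)) for _ in range(3)]
position, sens = best_insertion_position(x, y, weights)
weights = insert_layer(weights, position)
print("inserted at position", position, "network depth is now", len(weights))
```

In this sketch the sensitivities come from a single forward and backward pass through the current network, which is one way to read the abstract's claim that the method is numerically inexpensive. Training would then continue on the extended network, with the new layer's weights free to move away from zero.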
