LG · OC · Nov 27, 2023

SensLI: Sensitivity-Based Layer Insertion for Neural Networks

arXiv:2311.15995v2 · h-index: 3 · Has Code
Originality: Incremental advance
AI Analysis

SensLI addresses the tedious, often manual tuning of network architecture that neural network practitioners face, though it appears to be an incremental improvement on existing adaptive architecture methods.

The authors tackle manual network architecture tuning by proposing SensLI, a sensitivity-based method that inserts new layers during training. In their numerical experiments, it improves training loss and test error relative to training a fixed architecture, and it requires less computational effort than training the extended architecture from the beginning.

The training of neural networks requires tedious and often manual tuning of the network architecture. We propose a systematic approach to inserting new layers during the training process. Our method eliminates the need to choose a fixed network size before training, is numerically inexpensive to execute and applicable to various architectures including fully connected feedforward networks, ResNets and CNNs. Our technique borrows ideas from constrained optimization and is based on first-order sensitivity information of the loss function with respect to the virtual parameters that additional layers, if inserted, would offer. In numerical experiments, our proposed sensitivity-based layer insertion technique (SensLI) exhibits improved performance on training loss and test error, compared to training on a fixed architecture, and reduced computational effort in comparison to training the extended architecture from the beginning. Our code is available on https://github.com/mathemml/SensLI.
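
The idea of scoring candidate layers by first-order sensitivity can be illustrated with a small sketch. This is not the authors' implementation (see their repository for that). It assumes a toy residual network x_{k+1} = x_k + W_k tanh(x_k) with a squared-error loss, and a virtual block x -> x + V tanh(x) that acts as the identity when V = 0. All function names (`forward`, `loss`, `layer_sensitivities`) are hypothetical. For each candidate position, the sketch computes the Frobenius norm of the batch-averaged gradient of the loss with respect to V at V = 0, using ordinary backpropagation, and then inserts a zero-initialized block where that norm is largest.

```python
import math
import random

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def forward(weights, x):
    """Toy residual net x_{k+1} = x_k + W_k tanh(x_k); returns states x_0..x_K."""
    states = [x]
    for W in weights:
        a = [math.tanh(v) for v in states[-1]]
        states.append([s + u for s, u in zip(states[-1], matvec(W, a))])
    return states

def loss(weights, data):
    """Batch mean of 0.5 * ||x_K - y||^2."""
    total = 0.0
    for x, y in data:
        out = forward(weights, x)[-1]
        total += 0.5 * sum((o - t) ** 2 for o, t in zip(out, y))
    return total / len(data)

def layer_sensitivities(weights, data):
    """Frobenius norm of dL/dV at V = 0 for a virtual block
    x -> x + V tanh(x) placed at each position k = 0..K (assumed block form)."""
    d, K, n = len(data[0][0]), len(weights), len(data)
    grads = [[[0.0] * d for _ in range(d)] for _ in range(K + 1)]
    for x, y in data:
        states = forward(weights, x)
        delta = [o - t for o, t in zip(states[-1], y)]  # dL/dx_K for this sample
        for k in range(K, -1, -1):
            a = [math.tanh(v) for v in states[k]]
            # delta holds dL/dx_k, so the virtual gradient is the outer product delta a^T.
            for i in range(d):
                for j in range(d):
                    grads[k][i][j] += delta[i] * a[j] / n
            if k > 0:
                # Backpropagate through the real layer k-1 to obtain dL/dx_{k-1}.
                W, prev = weights[k - 1], states[k - 1]
                back = [sum(W[i][j] * delta[i] for i in range(d)) for j in range(d)]
                delta = [dl + (1.0 - math.tanh(p) ** 2) * b
                         for dl, p, b in zip(delta, prev, back)]
    return [math.sqrt(sum(g * g for row in G for g in row)) for G in grads]

random.seed(0)
d, K = 2, 3
weights = [[[random.gauss(0, 0.5) for _ in range(d)] for _ in range(d)] for _ in range(K)]
data = [([random.gauss(0, 1) for _ in range(d)], [random.gauss(0, 1) for _ in range(d)])
        for _ in range(8)]

sens = layer_sensitivities(weights, data)
best = max(range(len(sens)), key=sens.__getitem__)
zero_block = [[0.0] * d for _ in range(d)]
# Insert a zero-initialized block at the most sensitive position; the loss is unchanged.
extended = weights[:best] + [zero_block] + weights[best:]
print("sensitivities:", [round(s, 4) for s in sens])
print("insert at position", best)
```

Because the inserted block starts at V = 0, the extended network reproduces the current network's output exactly, so training can continue from the same loss value. The selection rule here (largest gradient norm) is an assumption made for illustration; the paper's criterion and its constrained-optimization derivation may differ in scaling and detail.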

Code Implementations: 2 repos