LG | Jul 25, 2022

On the benefits of non-linear weight updates

arXiv:2207.12505v1 | h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in DNN optimization for researchers and practitioners, but it is incremental as it adapts existing optimizers.

The authors tackle the problem of suboptimal Signal-to-Noise Ratio (SNR) in DNN weight configurations by applying a non-linear function to the gradients before each parameter update. They report that this adaptation improves performance for many problem types.

Recent work has suggested that the generalisation performance of a DNN is related to the extent to which the Signal-to-Noise Ratio is optimised at each of the nodes. In contrast, Gradient Descent methods do not always lead to SNR-optimal weight configurations. One way to improve SNR performance is to suppress large weight updates and amplify small weight updates. Such balancing is already implicit in some common optimizers, but we propose an approach that makes this explicit. The method applies a non-linear function to gradients prior to making DNN parameter updates. We investigate the performance with such non-linear approaches. The result is an adaptation to existing optimizers that improves performance for many problem types.
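The idea of transforming gradients before the update can be illustrated with a minimal sketch. The abstract does not state which non-linear function the authors use, so the signed power function below, `sign(g) * |g|**p` with `p = 0.5`, is an assumption chosen only for illustration. The names `signed_power` and `sgd_step` are hypothetical, and plain SGD stands in for whichever optimizer is being adapted.

```python
import math

def signed_power(g, p=0.5):
    """Hypothetical non-linearity (an assumption, not taken from the paper):
    sign(g) * |g|**p. For 0 < p < 1 it shrinks |g| > 1 and enlarges |g| < 1."""
    return math.copysign(abs(g) ** p, g)

def sgd_step(weights, grads, lr=0.1, transform=None):
    """One plain SGD step. If `transform` is given, it is applied
    element-wise to the gradients before the update."""
    if transform is not None:
        grads = [transform(g) for g in grads]
    return [w - lr * g for w, g in zip(weights, grads)]

weights = [1.0, 1.0, 1.0]
grads = [4.0, 0.01, -0.25]  # one large, one tiny, one moderate gradient

linear = sgd_step(weights, grads)                             # standard update
nonlinear = sgd_step(weights, grads, transform=signed_power)  # transformed update

for g, a, b in zip(grads, linear, nonlinear):
    print(f"grad={g:+.2f}  linear={a:.3f}  nonlinear={b:.3f}")
```

With this particular choice, the ratio between the largest and smallest update magnitudes drops from 400 to 20, which is the kind of balancing the abstract describes. The crossover at `|g| = 1` is a property of this sketch and depends on the gradient scale; a practical variant would likely need a scale parameter.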

