CV, Aug 14, 2022

Gradient Mask: Lateral Inhibition Mechanism Improves Performance in Artificial Neural Networks

Lei Jiang, Yongqing Liu, Shihai Xiao, Yansong Chua

arXiv:2208.06918v1, 2.61 citations, h-index: 18

Originality: Incremental advance

AI Analysis

This work addresses overfitting issues in neural networks for researchers and practitioners, though it appears incremental as it builds on existing backpropagation methods with a biologically inspired tweak.

The authors tackled the problem of overfitting in deep learning by proposing Gradient Mask, a method inspired by biological lateral inhibition to filter noise gradients during backpropagation, resulting in improved accuracy in CNNs under various conditions such as pruning and adversarial attacks.

Lateral inhibitory connections have been observed in the cortex of the biological brain, and have been extensively studied in terms of their role in cognitive functions. However, in the vanilla version of backpropagation in deep learning, all gradients (which can be understood to comprise both signal and noise gradients) flow through the network during weight updates. This may lead to overfitting. In this work, inspired by biological lateral inhibition, we propose Gradient Mask, which effectively filters out noise gradients in the process of backpropagation. This allows the learned feature information to be more intensively stored in the network while filtering out noisy or unimportant features. Furthermore, we demonstrate analytically how lateral inhibition in artificial neural networks improves the quality of propagated gradients. A new criterion for gradient quality is proposed which can be used as a measure during training of various convolutional neural networks (CNNs). Finally, we conduct several experiments to study how Gradient Mask improves the performance of the network both quantitatively and qualitatively. Quantitatively, accuracy in the original CNN architecture, accuracy after pruning, and accuracy after adversarial attacks all show improvements. Qualitatively, the CNN trained using Gradient Mask develops saliency maps that focus primarily on the object of interest, which is useful for data augmentation and network interpretability.
