On regularization for a convolutional kernel in neural networks
This work addresses a known bottleneck in deep learning, namely the stability and generalization of CNNs, but the contribution is incremental because it builds on existing regularization techniques.
The authors tackle exploding/vanishing gradients and poor generalization in convolutional neural networks by proposing a penalty function that constrains the singular values of the transformation matrix corresponding to a convolutional kernel to lie near 1, and they demonstrate its effectiveness through numerical examples.
Convolutional neural networks are an important class of models in deep learning. To avoid exploding/vanishing gradient problems and to improve the generalizability of a neural network, it is desirable to have a convolution operation that nearly preserves the norm, or equivalently to have the singular values of the transformation matrix corresponding to a convolutional kernel bounded around $1$. We propose a penalty function that can be used in the optimization of a convolutional neural network to constrain the singular values of the transformation matrix around $1$. We derive an algorithm to carry out the gradient descent minimization of this penalty function in terms of convolution kernels. Numerical examples are presented to demonstrate the effectiveness of the method.
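To make the idea of penalizing singular values concrete, the following is a minimal sketch, not the paper's own formula or algorithm. It assumes a 2D convolution with periodic (circular) boundary conditions on n x n inputs, for which the singular values of the transformation matrix can be read off from the 2D Fourier transform of the kernel. The penalty form, the sum of $(\sigma_i^2 - 1)^2$ over all singular values $\sigma_i$, is an illustrative assumption (for equal input and output channel counts it equals $\|M^T M - I\|_F^2$), and the function names `conv_singular_values` and `singular_value_penalty` are hypothetical.

```python
import numpy as np


def conv_singular_values(kernel, n):
    """Singular values of a circular 2D convolution on n x n inputs.

    kernel has shape (k, k, c_in, c_out). Assumption: periodic boundary
    conditions, so the transformation matrix is block-diagonalized by the
    2D Fourier transform of the zero-padded kernel.
    """
    # Zero-pad the spatial axes to n x n and transform them.
    transforms = np.fft.fft2(kernel, s=(n, n), axes=(0, 1))
    # One (c_in x c_out) matrix per frequency; SVD is applied to each.
    return np.linalg.svd(transforms, compute_uv=False)


def singular_value_penalty(kernel, n):
    """Illustrative penalty: sum over singular values of (sigma^2 - 1)^2."""
    sigma = conv_singular_values(kernel, n)
    return float(np.sum((sigma ** 2 - 1.0) ** 2))


# A delta kernel acting as the identity on each channel preserves the norm,
# so its penalty is (numerically) zero; scaling it moves every singular
# value away from 1 and the penalty grows.
identity_kernel = np.zeros((3, 3, 2, 2))
identity_kernel[0, 0] = np.eye(2)
penalty_identity = singular_value_penalty(identity_kernel, n=4)
penalty_scaled = singular_value_penalty(2.0 * identity_kernel, n=4)
```

In a training loop, a term like this would be added to the task loss with a weighting coefficient, so that gradient descent pulls the singular values toward 1; the gradient with respect to the kernel entries would come from automatic differentiation or from a derivation such as the one the abstract describes.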