Learning a smooth kernel regularizer for convolutional neural networks
This work addresses the need for more data-efficient training methods in computer vision. Its contribution is incremental, since it builds on existing regularization techniques.
The paper tackles the high data requirements of deep neural network training by proposing a smooth kernel regularizer that encourages spatial correlations in convolution kernel weights. The regularizer is intended to improve sample complexity, and the authors report that it improves over an L2 regularization baseline on visual recognition tasks.
Modern deep neural networks require a tremendous amount of data to train, often needing hundreds or thousands of labeled examples to learn an effective representation. For these networks to work with less data, more structure must be built into their architectures or learned from previous experience. The learned weights of convolutional neural networks (CNNs) trained on large datasets for object recognition contain a substantial amount of structure. These representations have parallels to simple cells in the primary visual cortex, where receptive fields are smooth and contain many regularities. Incorporating smoothness constraints over the kernel weights of modern CNN architectures is a promising way to improve their sample complexity. We propose a smooth kernel regularizer that encourages spatial correlations in convolution kernel weights. The correlation parameters of this regularizer are learned from previous experience, yielding a method with a hierarchical Bayesian interpretation. We show that our correlated regularizer can help constrain models for visual recognition, improving over an L2 regularization baseline.
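To make the idea of a correlated regularizer concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the regularizer takes the form of a zero-mean Gaussian prior over each flattened kernel, so the penalty is the quadratic form 0.5 * w^T Sigma^{-1} w, and that the covariance Sigma is estimated from a bank of previously learned kernels. Here that bank is synthetic, drawn from an RBF covariance over spatial positions as a stand-in for "previous experience". All function names, the ridge term, and the lengthscale are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_grid_covariance(k=3, lengthscale=1.5):
    """Illustrative stand-in: RBF covariance over the k*k spatial positions."""
    ys, xs = np.meshgrid(np.arange(k), np.arange(k), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    sq_dist = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dist / lengthscale ** 2)


def fit_kernel_covariance(kernels, ridge=1e-3):
    """Estimate a (k*k, k*k) covariance from a kernel bank of shape (n, k, k)."""
    flat = kernels.reshape(kernels.shape[0], -1)
    centered = flat - flat.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / flat.shape[0]
    # The ridge keeps the estimate invertible (an assumption of this sketch).
    return cov + ridge * np.eye(cov.shape[0])


def correlated_penalty(kernels, precision):
    """Sum over kernels of 0.5 * w^T precision w, with w the flattened kernel."""
    flat = kernels.reshape(kernels.shape[0], -1)
    return 0.5 * float(np.einsum("ni,ij,nj->", flat, precision, flat))


def l2_penalty(kernels):
    """Standard L2 penalty: the special case precision = identity."""
    return 0.5 * float(np.sum(kernels ** 2))


# Synthetic "previous experience": smooth kernels sampled from the RBF prior.
rng = np.random.default_rng(0)
k = 3
true_cov = rbf_grid_covariance(k)
chol = np.linalg.cholesky(true_cov + 1e-6 * np.eye(k * k))
bank = (rng.standard_normal((5000, k * k)) @ chol.T).reshape(-1, k, k)

# Learn the correlation structure from the bank, then invert it once.
precision = np.linalg.inv(fit_kernel_covariance(bank))

# Two test kernels with the same L2 norm: a flat one and a checkerboard.
smooth = np.ones((1, k, k)) / k
rough = (((np.indices((k, k)).sum(axis=0) % 2) * 2.0 - 1.0) / k)[None]

print("L2 penalty (smooth, rough):", l2_penalty(smooth), l2_penalty(rough))
print("correlated penalty (smooth, rough):",
      correlated_penalty(smooth, precision),
      correlated_penalty(rough, precision))
```

Under these assumptions, L2 regularization cannot distinguish the two test kernels because they have equal norm, whereas the correlated penalty charges the checkerboard kernel far more than the flat one. In training, a term of this kind would be added to the task loss in place of the usual weight decay.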