Kervolutional Neural Networks
The paper addresses a fundamental problem in computer vision by enhancing model capacity for tasks such as image recognition, though the contribution appears incremental because it builds on existing CNN frameworks.
The paper tackles a limitation of CNNs, namely that their non-linearity comes only from point-wise activation layers, by introducing kervolution, a non-linear generalization of convolution that uses kernel functions to capture higher-order feature interactions without extra parameters. In experiments, it achieves higher accuracy and faster convergence than baseline CNNs.
Convolutional neural networks (CNNs) have enabled state-of-the-art performance in many computer vision tasks. However, little effort has been devoted to establishing convolution in non-linear space. Existing works mainly rely on activation layers, which can only provide point-wise non-linearity. To solve this problem, a new operation, kervolution (kernel convolution), is introduced to approximate complex behaviors of human perception systems by leveraging the kernel trick. It generalizes convolution, enhances the model capacity, and captures higher-order interactions of features via patch-wise kernel functions, without introducing additional parameters. Extensive experiments show that kervolutional neural networks (KNN) achieve higher accuracy and faster convergence than baseline CNNs.
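To make the idea of a patch-wise kernel function concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes stride 1, no padding, a single channel, and a polynomial kernel of the form (x·w + c)^d as one illustrative choice of kernel. The names `extract_patches`, `linear_kernel`, `polynomial_kernel`, and `kervolve2d` are hypothetical. With the linear kernel the operation reduces to ordinary convolution (implemented, as in most CNN libraries, as cross-correlation); swapping in the polynomial kernel adds non-linearity while reusing the same weights, so no parameters are added.

```python
def extract_patches(image, kh, kw):
    """Return a grid of flattened kh x kw patches (stride 1, no padding)."""
    rows, cols = len(image), len(image[0])
    return [
        [
            [image[i + a][j + b] for a in range(kh) for b in range(kw)]
            for j in range(cols - kw + 1)
        ]
        for i in range(rows - kh + 1)
    ]


def linear_kernel(x, w):
    """Inner product of patch and filter: ordinary (linear) convolution."""
    return sum(xi * wi for xi, wi in zip(x, w))


def polynomial_kernel(x, w, c=1.0, d=2):
    """Assumed illustrative kernel: (x . w + c) ** d.

    c and d are fixed hyperparameters here, not learned weights.
    """
    return (linear_kernel(x, w) + c) ** d


def kervolve2d(image, weight, kernel=linear_kernel):
    """Apply kernel(patch, filter) at every patch position of the image."""
    kh, kw = len(weight), len(weight[0])
    w = [v for row in weight for v in row]  # flatten the filter once
    return [[kernel(x, w) for x in row] for row in extract_patches(image, kh, kw)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
weight = [[1, 0],
          [0, 1]]

linear_out = kervolve2d(image, weight)                    # plain convolution
poly_out = kervolve2d(image, weight, polynomial_kernel)   # kervolution sketch
```

Both calls use the same four filter weights; only the function applied to each patch and filter pair changes, which is the sense in which the non-linearity comes without additional parameters.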