Incorporating Image Gradients as Secondary Input Associated with Input Image to Improve the Performance of the CNN Model
This addresses the problem of enhancing generalization in vision tasks for researchers and practitioners, but it is incremental as it builds on existing CNN methods by adding gradient inputs.
The paper tackles the problem of improving CNN performance by proposing a new architecture that incorporates image gradients as a secondary input alongside the original image, sharing layers to use the same number of parameters. The results show superior performance on datasets like MNIST, CIFAR10, and CIFAR100 compared to benchmark CNNs using single inputs.
CNN is very popular neural network architecture in modern days. It is primarily most used tool for vision related task to extract the important features from the given image. Moreover, CNN works as a filter to extract the important features using convolutional operation in distinct layers. In existing CNN architectures, to train the network on given input, only single form of given input is fed to the network. In this paper, new architecture has been proposed where given input is passed in more than one form to the network simultaneously by sharing the layers with both forms of input. We incorporate image gradient as second form of the input associated with the original input image and allowing both inputs to flow in the network using same number of parameters to improve the performance of the model for better generalization. The results of the proposed CNN architecture, applying on diverse set of datasets such as MNIST, CIFAR10 and CIFAR100 show superior result compared to the benchmark CNN architecture considering inputs in single form.