Fast Fourier Transformation for Optimizing Convolutional Neural Networks in Object Recognition
This work addresses computational efficiency for researchers and practitioners in computer vision, but it is incremental as it applies an existing method (FFT) to a known architecture (U-Net) on a specific dataset.
The paper tackles the high computational cost of convolutional neural networks for object recognition by using Fast Fourier Transformation to speed up convolution, reducing training time from 600-700 ms/step to 400-500 ms/step and improving accuracy as measured by Intersection over Union.
This paper proposes a Fast Fourier Transformation (FFT)-based U-Net (a refined fully convolutional network) to perform image convolution in neural networks. By leveraging the FFT, it reduces the cost of the image convolutions in Convolutional Neural Networks (CNNs) and thus the overall computational cost. The proposed model identifies object information in images. We apply the FFT algorithm to an image dataset to obtain more accessible information about the image data before segmenting the images with the U-Net architecture. More specifically, we implement an FFT-based convolutional neural network to reduce the training time of the network. The proposed approach was applied to the publicly available Broad Bioimage Benchmark Collection (BBBC) dataset. Our model reduced the training time during convolution from 600-700 ms/step to 400-500 ms/step. We evaluated the accuracy of our model using the Intersection over Union (IoU) metric, which showed significant improvements.
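The idea behind FFT-based convolution can be illustrated with a minimal NumPy sketch. It relies on the convolution theorem: a linear convolution in the spatial domain equals a pointwise product of zero-padded Fourier transforms. The abstract does not describe how the FFT is integrated into the U-Net layers, so the function names `direct_conv2d` and `fft_conv2d`, the single-channel inputs, and the "full" output size are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def direct_conv2d(image, kernel):
    """Reference: full 2-D linear convolution by explicit summation."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    # Each kernel tap adds a shifted, scaled copy of the image.
    for i in range(kh):
        for j in range(kw):
            out[i:i + ih, j:j + iw] += kernel[i, j] * image
    return out


def fft_conv2d(image, kernel):
    """Same full convolution, computed as a product of FFTs."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    # Zero-pad both inputs to the full output size so the circular
    # convolution implied by the FFT matches the linear one.
    shape = (ih + kh - 1, iw + kw - 1)
    f_image = np.fft.rfft2(image, s=shape)
    f_kernel = np.fft.rfft2(kernel, s=shape)
    return np.fft.irfft2(f_image * f_kernel, s=shape)


# Illustrative random data (not from the BBBC dataset).
rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))
kernel = rng.standard_normal((5, 5))

result_direct = direct_conv2d(image, kernel)
result_fft = fft_conv2d(image, kernel)
```

Both functions return the same array up to floating-point error. The FFT route costs on the order of N log N operations for an image of N pixels regardless of kernel size, so its advantage over direct summation generally grows with larger kernels and images. Note that most deep learning frameworks implement "convolution" layers as cross-correlation, so an FFT-based layer would need to flip the kernel (or conjugate its transform) to match them.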