Cross-Task Benchmarking of CNN Architectures
This incremental work provides a cross-task benchmarking analysis for researchers in neural network design, focusing on dynamic CNNs across multiple data modalities.
The study compared dynamic CNN variants, including attention mechanisms and ODConv, across image classification, segmentation, and time series tasks, finding they consistently outperformed conventional CNNs in accuracy and efficiency, with ODConv excelling on complex images.
This project provides a comparative study of dynamic convolutional neural networks (CNNs) across several tasks: image classification, segmentation, and time series analysis. Using ResNet-18 as the common backbone, we compare five CNN variants: the vanilla CNN, the hard attention-based CNN, the soft attention-based CNN with local (pixel-wise) feature attention, the soft attention-based CNN with global (image-wise) feature attention, and omni-dimensional dynamic convolution (ODConv). Experiments on Tiny ImageNet, Pascal VOC, and the UCR Time Series Classification Archive show that attention mechanisms and dynamic convolution methods consistently exceed conventional CNNs in accuracy and computational efficiency. ODConv was especially effective on morphologically complex images because it adapts dynamically to varying spatial patterns. Dynamic CNNs enhanced feature representation and cross-task generalization through adaptive kernel modulation. This project offers perspectives on advanced CNN architecture design for multiple data modalities and indicates promising directions in neural network engineering.
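To make "adaptive kernel modulation" concrete, the following is a minimal sketch of the general idea behind dynamic convolution, not the implementation used in this study. It assumes the simplest form: a gate computes input-dependent attention weights over a few candidate kernels, and the kernels are mixed before a single convolution is applied. The names `dynamic_conv1d`, `gate_weights`, and `gate_biases` are hypothetical, the example is one-dimensional and single-channel for brevity, and ODConv goes further by also attending over the spatial, input-channel, and output-channel dimensions of each kernel.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_conv1d(signal, kernels, gate_weights, gate_biases):
    """Input-dependent mixture of candidate kernels (illustrative only).

    signal:       list of floats, the 1D input
    kernels:      list of K candidate kernels, all of the same width
    gate_weights: K hypothetical gate weights (one per kernel)
    gate_biases:  K hypothetical gate biases (one per kernel)
    Returns (output, attention).
    """
    # Global average pooling: summarise the whole input as one scalar.
    context = sum(signal) / len(signal)

    # A tiny linear gate followed by softmax gives one weight per kernel.
    attention = softmax(
        [w * context + b for w, b in zip(gate_weights, gate_biases)]
    )

    # Mix the candidate kernels into a single input-specific kernel.
    width = len(kernels[0])
    mixed = [
        sum(a * k[i] for a, k in zip(attention, kernels))
        for i in range(width)
    ]

    # "Valid" cross-correlation of the input with the mixed kernel.
    output = [
        sum(mixed[j] * signal[i + j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]
    return output, attention


if __name__ == "__main__":
    edge_kernel = [1.0, 0.0, -1.0]      # responds to changes
    identity_kernel = [0.0, 1.0, 0.0]   # passes the centre value through
    out, att = dynamic_conv1d(
        [1.0, 2.0, 3.0, 4.0],
        [edge_kernel, identity_kernel],
        gate_weights=[1.0, -1.0],
        gate_biases=[0.0, 0.0],
    )
    print("attention:", att)
    print("output:", out)
```

Because the attention weights depend on the input, two different inputs are filtered by two different effective kernels even though the candidate kernels themselves are fixed. In a trained network the gate and the candidate kernels would be learned jointly; here they are set by hand purely for illustration.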