CV

Jan 15, 2022

OneDConv: Generalized Convolution For Transform-Invariant Representation

Tong Zhang, Haohan Weng, Ke Yi, C. L. Philip Chen

arXiv:2201.05781v11.4

Originality: Incremental advance

AI Analysis

This work addresses the limited robustness of CNNs on vision tasks in complex real-world scenarios. It offers a novel method for a known bottleneck rather than a foundational advance.

The paper tackles the lack of transform invariance in CNNs by proposing OneDConv, a generalized convolutional operator that dynamically adjusts its kernels based on the input features. The authors report that this improves robustness and generalization without degrading performance on standard images, and that OneDConv outperforms baselines on multiple benchmarks.

Convolutional Neural Networks (CNNs) have shown great power in a variety of vision tasks. However, their lack of transform invariance limits their application in complicated real-world scenarios. In this work, we propose a novel generalized one-dimensional convolutional operator (OneDConv), which dynamically transforms the convolution kernels based on the input features in a computationally and parametrically efficient manner. The proposed operator naturally extracts transform-invariant features. It improves the robustness and generalization of convolution without sacrificing performance on common images. OneDConv can substitute for vanilla convolution, so it can be readily incorporated into popular convolutional architectures and trained end-to-end. On several popular benchmarks, OneDConv outperforms the original convolution operation and other proposed models on both canonical and distorted images.
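The abstract does not spell out how the kernels are transformed, so the following is only a toy illustration of the general idea (an input-dependent kernel transformation yielding a transform-invariant response), not the OneDConv operator itself. Everything in it is an assumption made for illustration: the restriction to 90-degree rotations, the hypothetical `orientation_index` feature (which rotation puts the most mass in the top row), and the function names.

```python
import numpy as np


def orientation_index(patch: np.ndarray) -> int:
    """Hypothetical orientation feature (an assumption, not from the paper):
    the number of 90-degree rotations that maximizes the top-row sum."""
    scores = [np.rot90(patch, j)[0, :].sum() for j in range(4)]
    return int(np.argmax(scores))


def static_response(patch: np.ndarray, kernel: np.ndarray) -> float:
    """Vanilla response: a fixed kernel applied to the patch."""
    return float((kernel * patch).sum())


def dynamic_response(patch: np.ndarray, kernel: np.ndarray) -> float:
    """Input-dependent response: rotate the kernel to match the patch's
    estimated orientation before applying it."""
    k = orientation_index(patch)
    adapted_kernel = np.rot90(kernel, -k)  # kernel transformed from the input
    return float((adapted_kernel * patch).sum())


rng = np.random.default_rng(0)
patch = rng.normal(size=(3, 3))
kernel = rng.normal(size=(3, 3))

for turns in range(4):
    rotated = np.rot90(patch, turns)
    print(turns, static_response(rotated, kernel), dynamic_response(rotated, kernel))
```

In this sketch the dynamic response stays the same for every rotated copy of the patch (up to floating-point rounding, and assuming the orientation score has a unique maximum), while the static response generally changes. The invariance here is limited to the four 90-degree rotations; the transformations handled by OneDConv are described in the paper itself.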
