CV · Mar 18, 2022

Towards Robust 2D Convolution for Reliable Visual Recognition

arXiv:2203.09790v1 · 1 citation · h-index: 52
Originality: Incremental advance
AI Analysis

This paper addresses unreliable visual recognition in CNNs for applications that require robustness to image distortions, though it is an incremental improvement over existing methods.

The paper tackles the vulnerability of 2D convolution in CNNs to image corruptions and adversarial samples by designing a robust alternative called RConv-MK. The block uses learnable kernels of different sizes and normalized soft thresholding to improve feature extraction, and it is validated through experiments on clean, corrupted, and adversarial images.

2D convolution (Conv2d), which is responsible for extracting features from the input image, is one of the key modules of a convolutional neural network (CNN). However, Conv2d is vulnerable to image corruptions and adversarial samples. Whether a more robust alternative to Conv2d can be designed for more reliable feature extraction is an important yet rarely investigated problem. In this paper, inspired by the recently developed learnable sparse transform, which learns to convert CNN features into a compact and sparse latent space, we design a novel building block, denoted RConv-MK, to strengthen the robustness of the extracted convolutional features. Our method leverages a set of learnable kernels of different sizes to extract features at different frequencies and employs a normalized soft thresholding operator to adaptively remove noise and trivial features at different corruption levels. Extensive experiments on clean images, corrupted images, and adversarial samples validate the effectiveness of the proposed robust module for reliable visual recognition. The source code is enclosed in the submission.
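The two ingredients named in the abstract, kernels of several sizes and a normalized soft thresholding operator, can be illustrated with a small 1D toy. This is only a sketch under stated assumptions, not the RConv-MK implementation: the abstract does not specify the normalization, so RMS scaling is used as a stand-in, the kernels are fixed rather than learned, the branches are combined by a plain sum, and every function name below is hypothetical.

```python
import math


def soft_threshold(x, tau):
    """Classic soft thresholding: shrink x toward zero by tau; |x| <= tau maps to 0."""
    return math.copysign(max(abs(x) - tau, 0.0), x)


def normalized_soft_threshold(features, tau):
    """Scale features by their RMS, soft-threshold, then restore the scale.

    Assumption: RMS normalization is a stand-in chosen for illustration;
    the paper's exact normalization is not given in the abstract.
    """
    rms = math.sqrt(sum(v * v for v in features) / len(features))
    if rms == 0.0:
        return [0.0 for _ in features]
    return [rms * soft_threshold(v / rms, tau) for v in features]


def conv1d_same(signal, kernel):
    """Zero-padded 'same' cross-correlation with an odd-length kernel."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = i + j - half
            if 0 <= k < len(signal):
                acc += w * signal[k]
        out.append(acc)
    return out


def multi_kernel_block(signal, kernels, tau):
    """Filter with each kernel, threshold each response, and sum the branches.

    Assumption: summing branches is one simple way to combine them.
    """
    branches = [
        normalized_soft_threshold(conv1d_same(signal, k), tau) for k in kernels
    ]
    return [sum(vals) for vals in zip(*branches)]
```

Because the threshold is applied after normalization, the same `tau` removes the same relative share of small responses whether the input is faint or strong, which is one plausible reading of "adaptively" in the abstract. In a trainable version the kernel weights and `tau` would be parameters updated by gradient descent.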

