CV | Nov 13, 2023

PadChannel: Improving CNN Performance through Explicit Padding Encoding

arXiv:2311.07623v2 | h-index: 1 | Has Code
AI Analysis

This paper addresses a subtle issue in CNN design for image classification, but the contribution is incremental: it builds on existing architectures and yields only minor gains.

The paper tackles ambiguous padding in CNNs by proposing PadChannel, a method that encodes padding status as an additional input channel. On ImageNet-1K classification, it yields small performance improvements and reduced variance at a marginal increase in computational cost.

In convolutional neural networks (CNNs), padding plays a pivotal role in preserving spatial dimensions throughout the layers. Traditional padding techniques do not explicitly distinguish between the actual image content and the padded regions, potentially causing CNNs to incorrectly interpret the boundary pixels or regions that resemble boundaries. This ambiguity can lead to suboptimal feature extraction. To address this, we propose PadChannel, a novel padding method that encodes padding statuses as an additional input channel, enabling CNNs to easily distinguish genuine pixels from padded ones. By incorporating PadChannel into several prominent CNN architectures, we observed small performance improvements and notable reductions in the variances on the ImageNet-1K image classification task at marginal increases in the computational cost. The source code is available at https://github.com/AussieSeaweed/pad-channel
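The idea in the abstract can be illustrated with a minimal pure-Python sketch. It is not taken from the linked repository: the function name `pad_with_channel`, the choice of zero padding for the image channels, and the convention that the extra channel holds 1 at padded positions and 0 at genuine pixels are all assumptions made for illustration.

```python
def pad_with_channel(image, pad):
    """Zero-pad every channel and append a padding-indicator channel.

    image: list of C channels, each an H x W list of lists of numbers.
    pad:   number of pixels added on every side (assumed symmetric).
    Returns C + 1 channels of size (H + 2*pad) x (W + 2*pad).
    """
    height = len(image[0])
    width = len(image[0][0])
    out_h, out_w = height + 2 * pad, width + 2 * pad

    def is_padded(row, col):
        # True when (row, col) lies outside the original image area.
        return not (pad <= row < pad + height and pad <= col < pad + width)

    # Original channels: copy genuine pixels, fill the border with zeros.
    padded = [
        [
            [0 if is_padded(r, c) else channel[r - pad][c - pad]
             for c in range(out_w)]
            for r in range(out_h)
        ]
        for channel in image
    ]

    # Extra channel (assumed convention): 1 marks padding, 0 marks real pixels.
    indicator = [
        [1 if is_padded(r, c) else 0 for c in range(out_w)]
        for r in range(out_h)
    ]
    return padded + [indicator]


# Usage: a single-channel 2 x 2 image padded by one pixel on each side.
example = pad_with_channel([[[5, 6], [7, 8]]], pad=1)
```

A convolution that consumes this output would need one more input channel than the original image has, which is consistent with the marginal increase in computational cost that the abstract reports.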

