CV · Jul 15, 2023

Improving Translation Invariance in Convolutional Neural Networks with Peripheral Prediction Padding

arXiv:2307.07725v1 · 1 citation · h-index: 13
Originality Highly original
AI Analysis

This paper addresses translation invariance issues in CNNs for computer vision tasks, offering a novel padding method that improves performance in semantic segmentation.

The paper tackles the problem that zero padding encodes absolute positional information in CNNs, which can harm performance. It proposes Peripheral Prediction Padding (PP-Pad), which learns padding values end-to-end and achieves higher accuracy and translation invariance in semantic segmentation tasks.

Zero padding is often used in convolutional neural networks to prevent the feature map size from decreasing with each layer. However, recent studies have shown that zero padding promotes encoding of absolute positional information, which may adversely affect performance on some tasks. In this work, a novel padding method called Peripheral Prediction Padding (PP-Pad) is proposed, which enables end-to-end training of padding values suited to each task in place of zero padding. Moreover, novel metrics for quantitatively evaluating the translation invariance of a model are presented. Evaluation with these metrics confirmed that the proposed method achieves higher accuracy and translation invariance than previous methods in a semantic segmentation task.
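The effect described in the abstract can be illustrated with a small 1D sketch. This is not the paper's architecture: the function names (`conv1d_valid`, `zero_pad`, `peripheral_pad`) and the hand-picked extrapolation weights are assumptions made for illustration, whereas PP-Pad learns its padding predictor end-to-end. The sketch only shows why zero padding makes border outputs distinguishable from interior ones, and how predicting padding values from peripheral values can remove that cue.

```python
def conv1d_valid(signal, kernel):
    """Plain 1D cross-correlation with no padding ('valid' mode)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def zero_pad(signal, p):
    """Standard zero padding: p zeros on each side."""
    return [0.0] * p + list(signal) + [0.0] * p


def peripheral_pad(signal, p, weights):
    """Hypothetical stand-in for a learned padding predictor.

    Each new padding value is a weighted sum of the nearest
    len(weights) values at that edge; weights[0] multiplies the
    value closest to the border. In PP-Pad these parameters would
    be trained; here they are fixed by hand (an assumption).
    """
    out = list(signal)
    for _ in range(p):
        left = sum(w * out[i] for i, w in enumerate(weights))
        right = sum(w * out[-1 - i] for i, w in enumerate(weights))
        out = [left] + out + [right]
    return out


kernel = [1.0, 1.0, 1.0]
flat = [2.0] * 6  # a constant signal carries no positional cue of its own

# Zero padding: the border outputs differ from the interior, so later
# layers can infer absolute position from them.
print(conv1d_valid(zero_pad(flat, 1), kernel))
# [4.0, 6.0, 6.0, 6.0, 6.0, 4.0]

# Padding predicted from the periphery (linear extrapolation weights):
# the output is uniform, so the border is no longer marked.
print(conv1d_valid(peripheral_pad(flat, 1, [2.0, -1.0]), kernel))
# [6.0, 6.0, 6.0, 6.0, 6.0, 6.0]
```

With the weights `[2.0, -1.0]` the predictor linearly extrapolates from the two nearest values; a trained predictor would instead choose whatever padding values minimize the task loss.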

