CV LG May 30, 2022

Pooling Revisited: Your Receptive Field is Suboptimal

Dong-Hwan Jang, Sanghyeok Chu, Joonhyuk Kim, Bohyung Han

arXiv:2205.15254v26.517 citations h-index: 57

Originality: Incremental advance

AI Analysis

This work addresses a fundamental issue in neural network design for computer vision, offering a method to enhance model performance by optimizing receptive fields, though it is incremental as it builds on existing resizing modules.

The authors tackled the problem of suboptimal receptive field sizes and shapes in neural networks by proposing DynOPool, a learnable pooling operation that optimizes receptive fields end-to-end, resulting in improved performance on image classification and semantic segmentation datasets.

The size and shape of the receptive field determine how the network aggregates local information and considerably affect the overall performance of a model. Many components in a neural network, such as kernel sizes and strides for convolution and pooling operations, influence the configuration of a receptive field. However, these components still rely on hyperparameters, and the receptive fields of existing models end up with suboptimal shapes and sizes. Hence, we propose a simple yet effective Dynamically Optimized Pooling operation, referred to as DynOPool, which optimizes the scale factors of feature maps end-to-end by learning the desirable size and shape of the receptive field in each layer. Any kind of resizing module in a deep neural network can be replaced by an operation with DynOPool at minimal cost. Also, DynOPool controls the complexity of a model by introducing an additional loss term that constrains computational cost. Our experiments show that models equipped with the proposed learnable resizing module outperform the baseline networks on multiple datasets in image classification and semantic segmentation.
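
As a rough illustration of how kernel sizes and strides fix the receptive field, the sketch below applies the standard one-axis recurrence: each layer grows the field by (kernel - 1) times the cumulative stride, and then multiplies the cumulative stride by its own stride. This is not DynOPool itself. The function name `receptive_field` and the example layer stacks are hypothetical, and padding and dilation are ignored.

```python
from typing import List, Tuple


def receptive_field(layers: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (receptive field size, cumulative stride) along one axis.

    Each layer is given as (kernel_size, stride). Padding and dilation
    are ignored in this simplified sketch.
    """
    rf, jump = 1, 1  # a single input pixel sees itself; unit step
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # growth scales with the current step
        jump *= stride             # later layers step further in input space
    return rf, jump


# Hypothetical stack: 3x3 conv, 2x2 pool (stride 2), 3x3 conv, 2x2 pool (stride 2)
small_pool = [(3, 1), (2, 2), (3, 1), (2, 2)]
# Same convolutions, but with 3x3 pooling at stride 3
large_pool = [(3, 1), (3, 3), (3, 1), (3, 3)]

print(receptive_field(small_pool))  # (10, 4)
print(receptive_field(large_pool))  # (17, 9)
```

Changing only the pooling hyperparameters moves the receptive field from 10 to 17 input pixels in this toy stack. This is the kind of hand-set choice that the abstract proposes to learn through scale factors instead.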
