CV · AI · Sep 17, 2024

Scale-Invariant Object Detection by Adaptive Convolution with Unified Global-Local Context

arXiv:2410.05274v2 · 1 citation · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses scale-invariant object detection in computer vision and proposes a method to improve the detection of very small objects. The contribution appears incremental, since it builds on the EfficientDet model.

The paper tackles multi-scale object detection, in particular the failure to detect smaller objects when dense features are lost in CNNs. It proposes SAC-Net, which combines adaptive atrous convolution with global-local context, and reports significant accuracy improvements over state-of-the-art models on benchmark datasets.

Dense features are important for detecting minute objects in images. Despite the remarkable efficacy of CNN models in multi-scale object detection, they often fail to detect smaller objects because dense features are lost during pooling. Atrous convolution addresses this issue by applying sparse kernels. However, sparse kernels can degrade the multi-scale detection efficacy of the CNN model. In this paper, we propose an object detection model using a Switchable (adaptive) Atrous Convolutional Network (SAC-Net) based on the EfficientDet model. A fixed atrous rate in the convolutional layers limits the performance of CNN models. To overcome this limitation, we introduce a switchable mechanism that dynamically adjusts the atrous rate during the forward pass. The proposed SAC-Net combines the benefits of low-level and high-level features to improve performance on multi-scale object detection tasks without losing the dense features. Further, we apply a depth-wise switchable atrous rate to the proposed network to improve the scale-invariant features. Finally, we apply global context to the proposed model. Our extensive experiments on benchmark datasets demonstrate that the proposed SAC-Net outperforms state-of-the-art models by a significant margin in terms of accuracy.
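The abstract describes the switchable atrous rate only at a high level, so the following is a minimal one-dimensional sketch of the general idea and not the authors' implementation. It assumes a soft switch that blends the outputs of the same kernel applied at two atrous rates, with the switch value predicted per position from the input. The names `atrous_conv1d`, `predict_switch`, and `switchable_atrous_conv1d`, the 3-tap local-mean gate, and the `gain` and `bias` parameters are all hypothetical and chosen for illustration.

```python
import math


def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution with zero padding; output length equals input length."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `rate` samples apart.
            idx = i + (j - center) * rate
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one atrous kernel."""
    return (kernel_size - 1) * rate + 1


def predict_switch(signal, gain=1.0, bias=0.0):
    """Hypothetical gate: sigmoid of a 3-tap local mean, one value in (0, 1) per position.

    Assumption for illustration only; the source does not specify how the switch is computed.
    """
    n = len(signal)
    gates = []
    for i in range(n):
        window = [signal[j] for j in (i - 1, i, i + 1) if 0 <= j < n]
        mean = sum(window) / len(window)
        gates.append(1.0 / (1.0 + math.exp(-(gain * mean + bias))))
    return gates


def switchable_atrous_conv1d(signal, kernel, rates, switch):
    """Blend two atrous rates per position: s * y_low + (1 - s) * y_high."""
    low_rate, high_rate = rates
    y_low = atrous_conv1d(signal, kernel, low_rate)
    y_high = atrous_conv1d(signal, kernel, high_rate)
    return [s * a + (1.0 - s) * b for s, a, b in zip(switch, y_low, y_high)]


if __name__ == "__main__":
    impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
    box = [1.0, 1.0, 1.0]
    gates = predict_switch(impulse)
    print(switchable_atrous_conv1d(impulse, box, rates=(1, 2), switch=gates))
```

With a switch value near 1 the layer behaves like a dense (rate 1) convolution and keeps fine detail, while a value near 0 widens the receptive field from 3 to 5 samples for the same 3-tap kernel. A learned, input-dependent switch lets each position choose between the two, which is the trade-off the abstract attributes to a fixed atrous rate.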

