CV · Oct 27, 2023

Learning to recognize occluded and small objects with partial inputs

arXiv:2310.18517v1 · 3 citations · h-index: 14
Originality: Highly original
AI Analysis

This paper addresses the challenge of recognizing occluded and small objects in computer vision. The contribution is incremental, as it builds on existing multi-label recognition methods.

The paper tackles the problem of recognizing occluded and small objects in multi-label image recognition by proposing Masked Supervised Learning (MSL), which learns context-based representations and models label co-occurrence, achieving competitive performance on standard benchmarks.

Recognizing multiple objects in an image is challenging due to occlusions, and becomes even more so when the objects are small. While promising, existing multi-label image recognition models do not explicitly learn context-based representations, and hence struggle to correctly recognize small and occluded objects. Intuitively, recognizing occluded objects requires knowledge of partial input, and hence context. Motivated by this intuition, we propose Masked Supervised Learning (MSL), a single-stage, model-agnostic learning paradigm for multi-label image recognition. The key idea is to learn context-based representations using a masked branch and to model label co-occurrence using label consistency. Experimental results demonstrate the simplicity, applicability and more importantly the competitive performance of MSL against previous state-of-the-art methods on standard multi-label image recognition benchmarks. In addition, we show that MSL is robust to random masking and demonstrate its effectiveness in recognizing non-masked objects. Code and pretrained models are available on GitHub.
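The abstract names two ingredients, a masked branch and label consistency, without giving their exact form. The following is a minimal sketch of how such a two-branch objective could be wired up, not the authors' implementation. Everything concrete here is an assumption made for illustration: the patch-wise masking scheme, the shared-weight `toy_model` standing in for a real backbone, binary cross-entropy as the supervised loss, a squared-difference consistency term, and the weighting parameter `lam`.

```python
import math
import random


def mask_patches(image, patch=2, ratio=0.5, rng=None):
    """Return a copy of `image` (a list of rows) with random patch-sized blocks zeroed.

    Assumption: block-wise random masking; the paper's masking scheme may differ.
    """
    if rng is None:
        rng = random.Random(0)
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the original image is untouched
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            if rng.random() < ratio:
                for r in range(top, min(top + patch, h)):
                    for c in range(left, min(left + patch, w)):
                        out[r][c] = 0.0
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_model(image, weights):
    """Hypothetical stand-in for a backbone: one linear score (logit) per label."""
    flat = [v for row in image for v in row]
    return [sum(w * v for w, v in zip(ws, flat)) for ws in weights]


def bce(logits, targets):
    """Mean binary cross-entropy over labels (multi-label setting)."""
    eps = 1e-7
    total = 0.0
    for z, t in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(logits)


def consistency(logits_a, logits_b):
    """Mean squared difference between the two branches' label probabilities.

    Assumption: one simple way to encourage agreement between branches.
    """
    diffs = [(sigmoid(a) - sigmoid(b)) ** 2 for a, b in zip(logits_a, logits_b)]
    return sum(diffs) / len(diffs)


def msl_style_loss(image, targets, weights, lam=1.0, rng=None):
    """Supervise both the full and the masked view, plus a consistency term."""
    full_logits = toy_model(image, weights)
    masked_logits = toy_model(mask_patches(image, rng=rng), weights)
    return (
        bce(full_logits, targets)
        + bce(masked_logits, targets)
        + lam * consistency(full_logits, masked_logits)
    )


if __name__ == "__main__":
    image = [[1.0] * 4 for _ in range(4)]       # toy 4x4 "image"
    weights = [[0.1] * 16, [-0.1] * 16]         # two labels, hypothetical weights
    print(msl_style_loss(image, [1, 0], weights, rng=random.Random(1)))
```

In this sketch the masked branch sees only part of the input but is still trained against the full label set, which is one way to push a model toward using context. For the actual losses and architecture, consult the paper and its released code.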

Code Implementations: 1 repo
