A patch-based architecture for multi-label classification from single label annotations
This work addresses the scarcity of annotations in multi-label classification, a problem relevant to computer vision researchers, although the contribution appears incremental because it builds on existing patch and attention mechanisms.
The paper tackles multi-label classification with only single-label annotations by proposing a patch-based architecture and a strategy for estimating negative examples. Unlike related methods, the architecture can be trained from scratch, without pre-training.
In this paper, we propose a patch-based architecture for multi-label classification problems in which only a single positive label is observed for each image of the dataset. Our contributions are twofold. First, we introduce a lightweight patch architecture based on the attention mechanism. Second, leveraging patch embedding self-similarities, we provide a novel strategy for estimating negative examples and dealing with positive and unlabeled learning problems. Experiments demonstrate that our architecture can be trained from scratch, whereas related methods from the literature require pre-training on similar databases.
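To make the setting concrete, the following is a minimal sketch of the general idea, not the architecture or estimation strategy of the paper, whose details are not given here. It assumes that each class owns a query vector that attends over patch embeddings, that a class is treated as an estimated negative when its most relevant patch is nearly identical (by cosine similarity) to the patch that best explains the observed positive label, and that all other classes stay unlabeled and are excluded from the loss. The function names, the `threshold` value, and the selection rule itself are hypothetical and chosen only for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu > 0 and nv > 0 else 0.0

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def class_logits(patches, class_queries):
    """Attention pooling: each class query attends over the patch embeddings."""
    d = len(patches[0])
    logits = []
    for q in class_queries:
        weights = softmax([dot(q, p) / math.sqrt(d) for p in patches])
        pooled = [sum(w * p[i] for w, p in zip(weights, patches)) for i in range(d)]
        logits.append(dot(q, pooled))
    return logits

def top_patch(patches, query):
    """Index of the patch that responds most strongly to a class query."""
    scores = [dot(query, p) for p in patches]
    return scores.index(max(scores))

def estimate_negatives(patches, class_queries, positive, threshold=0.9):
    """Hypothetical rule (an assumption, not the paper's strategy): a class is an
    estimated negative if its top patch is self-similar to the positive's top patch,
    i.e. the class has no distinct image region of its own."""
    anchor = patches[top_patch(patches, class_queries[positive])]
    negatives = []
    for c, q in enumerate(class_queries):
        if c == positive:
            continue
        if cosine(anchor, patches[top_patch(patches, q)]) >= threshold:
            negatives.append(c)
    return negatives

def single_positive_loss(logits, positive, negatives):
    """Binary cross-entropy over the observed positive and the estimated negatives;
    the remaining classes are unlabeled and contribute nothing."""
    loss = softplus(-logits[positive])
    loss += sum(softplus(logits[c]) for c in negatives)
    return loss / (1 + len(negatives))

if __name__ == "__main__":
    patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]   # toy 2-D patch embeddings
    queries = [[1.0, 0.0], [1.0, 0.2], [0.0, 1.0]]   # one toy query per class
    logits = class_logits(patches, queries)
    negatives = estimate_negatives(patches, queries, positive=0)
    print(negatives, single_positive_loss(logits, 0, negatives))
```

In this toy example, the class whose top patch coincides with that of the observed positive is selected as a negative, while the class that responds to a distinct patch remains unlabeled, which mirrors the positive and unlabeled view of the problem.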