CV · MM · Jan 23, 2021

Exploiting Web Images for Fine-Grained Visual Recognition by Eliminating Noisy Samples and Utilizing Hard Ones

arXiv:2101.09412v1
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of fine-grained recognition without expert labeling. It is an incremental advance, since it builds on existing web-supervised methods.

The paper tackles training fine-grained visual recognition models from web images, which often contain label noise and hard examples. It proposes an approach that removes irrelevant noisy samples and uses hard ones during training, and it reports better performance than state-of-the-art web-supervised methods on three datasets.

Labeling objects at a subordinate level typically requires expert knowledge, which is not always available when using random annotators. As such, learning directly from web images for fine-grained recognition has attracted broad attention. However, label noise and hard examples in web images are two obstacles to training robust fine-grained recognition models. Therefore, in this paper, we propose a novel approach for removing irrelevant samples from real-world web images during training, while employing useful hard examples to update the network. Thus, our approach can alleviate the harmful effects of irrelevant noisy web images and hard examples to achieve better performance. Extensive experiments on three commonly used fine-grained datasets demonstrate that our approach is far superior to current state-of-the-art web-supervised methods.

Code Implementations: 1 repo
