CV · Mar 22, 2016

Fully Convolutional Attention Networks for Fine-Grained Recognition

arXiv:1603.06765v4 · 102 citations
Originality: Highly original
AI Analysis

This paper addresses two obstacles in fine-grained visual recognition: part annotations are expensive to obtain, and parts are hard to define for many classes. It is an incremental improvement that applies a novel method to a known bottleneck.

The paper tackles fine-grained recognition by introducing Fully Convolutional Attention Networks (FCANs), a reinforcement learning framework that localizes discriminative regions without part annotations, achieving competitive results on datasets like CUB-200-2011 and Food-101.

Fine-grained recognition is challenging due to its subtle local inter-class differences versus large intra-class variations such as poses. A key to addressing this problem is to localize discriminative parts to extract pose-invariant features. However, ground-truth part annotations can be expensive to acquire. Moreover, it is hard to define parts for many fine-grained classes. This work introduces Fully Convolutional Attention Networks (FCANs), a reinforcement learning framework to optimally glimpse local discriminative regions adaptive to different fine-grained domains. Compared to previous methods, our approach enjoys three advantages: 1) the weakly-supervised reinforcement learning procedure requires no expensive part annotations; 2) the fully-convolutional architecture speeds up both training and testing; 3) the greedy reward strategy accelerates the convergence of the learning. We demonstrate the effectiveness of our method with extensive experiments on four challenging fine-grained benchmark datasets, including CUB-200-2011, Stanford Dogs, Stanford Cars and Food-101.
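
To make the idea of learning where to glimpse without part annotations more concrete, here is a minimal toy sketch of a generic REINFORCE policy-gradient update over a flat grid of candidate locations. It is not the authors' FCAN implementation: the abstract does not specify the network, the reward definition, or the training loop, so the function names, the 16-location grid, the learning rate, and the `toy_reward` stand-in (reward 1 when the sampled location is the single "discriminative" one, 0 otherwise) are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Turn a flat list of attention scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_glimpse(probs, rng):
    """Sample one location index from the attention distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_grad(probs, action, reward):
    """Gradient of reward * log pi(action) with respect to the pre-softmax scores."""
    return [reward * ((1.0 if i == action else 0.0) - p)
            for i, p in enumerate(probs)]


def train(reward_fn, n_locations=16, steps=2000, lr=0.5, seed=0):
    """Run plain REINFORCE on a table of scores (hypothetical toy setup)."""
    rng = random.Random(seed)
    scores = [0.0] * n_locations
    for _ in range(steps):
        probs = softmax(scores)
        action = sample_glimpse(probs, rng)
        reward = reward_fn(action)
        grad = reinforce_grad(probs, action, reward)
        # Gradient ascent on expected reward.
        scores = [s + lr * g for s, g in zip(scores, grad)]
    return softmax(scores)


# Assumption: a toy stand-in for "the classifier is correct when it sees this region".
DISCRIMINATIVE = 5


def toy_reward(location):
    return 1.0 if location == DISCRIMINATIVE else 0.0


final_probs = train(toy_reward)
best = max(range(len(final_probs)), key=final_probs.__getitem__)
print("most probable glimpse location:", best)
```

The only supervision in this sketch is the scalar reward, which mirrors the weakly-supervised setting the abstract describes: no location labels are ever provided, yet the policy concentrates probability on the region that makes the downstream decision succeed.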

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
