CV | Mar 14, 2019

Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-grained Image Recognition

arXiv:1903.06150v2 | 432 citations
Originality: Incremental advance
AI Analysis

The paper addresses the problem of recognizing subtle visual differences between images, as needed for applications such as species or object classification. The advance is incremental because it builds on existing attention-based methods.

The paper tackles fine-grained image recognition by learning discriminative features from hundreds of part proposals using a Trilinear Attention Sampling Network (TASN), achieving state-of-the-art performance on datasets like iNaturalist-2017, CUB-Bird, and Stanford-Cars.

Learning subtle yet discriminative features (e.g., beak and eyes for a bird) plays a significant role in fine-grained image recognition. Existing attention-based approaches localize and amplify significant parts to learn fine-grained details, which often suffer from a limited number of parts and heavy computational cost. In this paper, we propose to learn such fine-grained features from hundreds of part proposals by Trilinear Attention Sampling Network (TASN) in an efficient teacher-student manner. Specifically, TASN consists of 1) a trilinear attention module, which generates attention maps by modeling the inter-channel relationships, 2) an attention-based sampler which highlights attended parts with high resolution, and 3) a feature distiller, which distills part features into a global one by weight sharing and feature preserving strategies. Extensive experiments verify that TASN yields the best performance under the same settings with the most competitive approaches, in iNaturalist-2017, CUB-Bird, and Stanford-Cars datasets.
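The trilinear attention module described above can be illustrated with a minimal NumPy sketch. The abstract does not state the formula, so the form used here, softmax(softmax(X) X^T) X applied to a feature map X reshaped to (channels, positions), is an assumption about one way to model inter-channel relationships, not a confirmed reproduction of the authors' implementation. The function name `trilinear_attention` and the toy input shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def trilinear_attention(features):
    """Map a (c, h, w) feature tensor to c attention maps of shape (h, w).

    Assumed form (not taken from the abstract): softmax(softmax(X) X^T) X.
    """
    c, h, w = features.shape
    x = features.reshape(c, h * w)

    # Normalize each channel over spatial positions.
    x_norm = softmax(x, axis=1)

    # (c, c) matrix of inter-channel relationships, normalized per row.
    relation = softmax(x_norm @ x.T, axis=1)

    # Each attention map is a relation-weighted mix of the original channels.
    maps = relation @ x
    return maps.reshape(c, h, w)


# Toy input standing in for a convolutional feature map.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 4))
attention_maps = trilinear_attention(features)
```

Because each row of the relation matrix sums to one, every attention map in this sketch is a convex combination of the input channels, so channels that respond to the same part tend to be pooled into similar maps. The sampler and the distiller are not covered here.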

Code Implementations: 1 repo
