CV · AI · LG · Aug 19, 2021

Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

arXiv:2108.08728v2 · 335 citations · Has Code
Originality: Highly original
AI Analysis

This work aims to improve attention mechanisms for fine-grained visual recognition, which matters for applications such as image categorization and re-identification. It builds on existing attention methods, so it may appear incremental, but it adds a novel causal approach.

The paper tackles the problem of learning effective visual attention for fine-grained recognition tasks by introducing a counterfactual attention learning method based on causal inference, resulting in consistent improvements across benchmarks such as fine-grained image categorization, person re-identification, and vehicle re-identification.

Attention mechanism has demonstrated great potential in fine-grained visual recognition tasks. In this paper, we present a counterfactual attention learning method to learn more effective attention based on causal inference. Unlike most existing methods that learn visual attention based on conventional likelihood, we propose to learn the attention with counterfactual causality, which provides a tool to measure the attention quality and a powerful supervisory signal to guide the learning process. Specifically, we analyze the effect of the learned visual attention on network prediction through counterfactual intervention and maximize the effect to encourage the network to learn more useful attention for fine-grained image recognition. Empirically, we evaluate our method on a wide range of fine-grained recognition tasks where attention plays a crucial role, including fine-grained image categorization, person re-identification, and vehicle re-identification. The consistent improvement on all benchmarks demonstrates the effectiveness of our method. Code is available at https://github.com/raoyongming/CAL
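The counterfactual intervention described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a toy setup in which the effect of attention is the prediction under the learned attention minus the average prediction under randomly sampled attention, and that a cross-entropy loss on this difference serves as the supervisory signal. All names here (`attention_pool`, `classify`, `counterfactual_effect`, `cross_entropy`), the random-attention choice, and the toy features are hypothetical and used only for illustration.

```python
import math
import random


def attention_pool(features, attention):
    """Weighted sum of per-location feature vectors using attention weights."""
    dim = len(features[0])
    return [sum(a * f[d] for a, f in zip(attention, features)) for d in range(dim)]


def classify(pooled, weights):
    """Toy linear classifier: one logit per class (assumed, not from the paper)."""
    return [sum(w * x for w, x in zip(row, pooled)) for row in weights]


def counterfactual_effect(features, attention, weights, rng, num_samples=8):
    """Logits under the given attention minus mean logits under random attention."""
    factual = classify(attention_pool(features, attention), weights)
    mean_cf = [0.0] * len(factual)
    for _ in range(num_samples):
        # Intervention: replace the learned attention with a random one.
        raw = [rng.random() for _ in attention]
        total = sum(raw)
        random_attention = [r / total for r in raw]
        cf = classify(attention_pool(features, random_attention), weights)
        mean_cf = [m + c / num_samples for m, c in zip(mean_cf, cf)]
    return [f - c for f, c in zip(factual, mean_cf)]


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single example."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


# Toy example: three image locations, two feature dims, two classes.
# Only location 0 carries evidence for class 0 (the true label).
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
rng = random.Random(0)

focused = counterfactual_effect(features, [1.0, 0.0, 0.0], weights, rng)
distracted = counterfactual_effect(features, [0.0, 0.5, 0.5], weights, rng)

print("loss with focused attention:   ", cross_entropy(focused, 0))
print("loss with distracted attention:", cross_entropy(distracted, 0))
```

In this toy setup, attention that focuses on the discriminative location yields a lower loss on the effect than attention that ignores it, which is the sense in which maximizing the effect rewards more useful attention.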

Code Implementations: 1 repo