CV | Jul 27, 2022

Object-ABN: Learning to Generate Sharp Attention Maps for Action Recognition

Tomoya Nitta, Tsubasa Hirakawa, Hironobu Fujiyoshi, Toru Tamaki

arXiv:2207.13306v1 | 1.4 | h-index: 24

Originality: Incremental advance

AI Analysis

This work addresses the need for more intuitive visual explanations in action recognition, though it is incremental because it extends an existing method, the Attention Branch Network (ABN).

The paper addresses the problem of blurry attention maps in action recognition by proposing Object-ABN, which uses instance segmentation and new losses to produce sharper maps and improve classification performance on the UCF101 and SSv2 datasets.

In this paper, we propose an extension of the Attention Branch Network (ABN) that uses instance segmentation to generate sharper attention maps for action recognition. Methods for visual explanation such as Grad-CAM usually generate blurry maps that are not intuitive for humans to understand, particularly when recognizing actions of people in videos. Our proposed method, Object-ABN, tackles this issue by introducing a new mask loss that makes the generated attention maps close to the instance segmentation result. Further, the PC loss and multiple attention maps are introduced to enhance the sharpness of the maps and improve classification performance. Experimental results with UCF101 and SSv2 show that the maps generated by the proposed method are much clearer, both qualitatively and quantitatively, than those of the original ABN.
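The abstract says only that the mask loss pulls the attention map toward the instance segmentation result; it does not give the formula. As a rough illustration of that idea, the sketch below uses a mean squared difference between an attention map and a binary mask. The squared-error form, the function name `mask_loss`, and the toy 2x3 grids are all assumptions made for illustration, not the formulation used in the paper.

```python
from typing import List

Grid = List[List[float]]


def mask_loss(attention: Grid, seg_mask: Grid) -> float:
    """Mean squared difference between an attention map and a mask.

    Assumption: squared error is a stand-in. The source only states that
    the loss makes attention maps close to the segmentation result.
    """
    if not attention or len(attention) != len(seg_mask):
        raise ValueError("attention and mask must be non-empty and equal in height")
    total = 0.0
    count = 0
    for att_row, mask_row in zip(attention, seg_mask):
        if not att_row or len(att_row) != len(mask_row):
            raise ValueError("rows must be non-empty and equal in width")
        for a, m in zip(att_row, mask_row):
            total += (a - m) ** 2
            count += 1
    return total / count


# Hypothetical toy data: 1.0 marks pixels covered by a segmented instance.
seg_mask = [[0.0, 1.0, 1.0],
            [0.0, 1.0, 0.0]]
# A uniformly "blurry" map versus one concentrated on the instance.
blurry_map = [[0.5, 0.5, 0.5],
              [0.5, 0.5, 0.5]]
sharp_map = [[0.1, 0.9, 0.9],
             [0.1, 0.9, 0.1]]

blurry_loss = mask_loss(blurry_map, seg_mask)
sharp_loss = mask_loss(sharp_map, seg_mask)
print(f"blurry: {blurry_loss:.3f}, sharp: {sharp_loss:.3f}")
```

Under this assumed form, a map concentrated on the segmented instance scores a lower loss than a uniform one, which is the behavior the abstract attributes to the mask loss. In training, such a term would be added to the classification loss with some weighting that the abstract does not specify.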
