CV · Oct 6, 2020

Rotate to Attend: Convolutional Triplet Attention Module

arXiv:2010.03045v2 · 934 citations · Has Code
AI Analysis

This work addresses the need for efficient attention modules in computer vision, offering a plug-and-play solution that enhances classic backbone networks, though it is incremental as it builds on existing attention paradigms.

The paper tackles the problem of building efficient attention mechanisms for computer vision by introducing triplet attention, a lightweight module that captures cross-dimension interactions using a three-branch structure with rotation operations, achieving improved performance on image classification and object detection tasks with negligible computational overhead.

Benefiting from the capability of building inter-dependencies among channels or spatial locations, attention mechanisms have been extensively studied and broadly used in a variety of computer vision tasks recently. In this paper, we investigate light-weight but effective attention mechanisms and present triplet attention, a novel method for computing attention weights by capturing cross-dimension interaction using a three-branch structure. For an input tensor, triplet attention builds inter-dimensional dependencies by the rotation operation followed by residual transformations and encodes inter-channel and spatial information with negligible computational overhead. Our method is simple as well as efficient and can be easily plugged into classic backbone networks as an add-on module. We demonstrate the effectiveness of our method on various challenging tasks including image classification on ImageNet-1k and object detection on MSCOCO and PASCAL VOC datasets. Furthermore, we provide extensive insight into the performance of triplet attention by visually inspecting the GradCAM and GradCAM++ results. The empirical evaluation of our method supports our intuition on the importance of capturing dependencies across dimensions when computing attention weights. Code for this paper can be publicly accessed at https://github.com/LandskapeAI/triplet-attention
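
The three-branch structure described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes each branch rotates the tensor, summarizes the leading dimension by stacking its max and mean (often called Z-pool), applies a k×k convolution and a sigmoid to obtain attention weights, rotates back, and that the three branch outputs are averaged. Batch normalization is omitted, the kernels are random rather than learned, and all function names are hypothetical.

```python
import numpy as np


def z_pool(x):
    """Stack max and mean over axis 1: (N, D, A, B) -> (N, 2, A, B)."""
    return np.stack([x.max(axis=1), x.mean(axis=1)], axis=1)


def conv2d_same(x, kernel):
    """Zero-padded 'same' cross-correlation.

    x: (N, 2, A, B), kernel: (2, k, k) with odd k -> output (N, 1, A, B).
    """
    k = kernel.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    # windows has shape (N, 2, A, B, k, k)
    windows = np.lib.stride_tricks.sliding_window_view(xp, (k, k), axis=(2, 3))
    out = np.einsum("ncabij,cij->nab", windows, kernel)
    return out[:, None]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attention_gate(x, kernel):
    """Scale x by a sigmoid map computed from its Z-pooled summary."""
    return x * sigmoid(conv2d_same(z_pool(x), kernel))


def triplet_attention(x, kernels):
    """x: (N, C, H, W); kernels: three (2, k, k) arrays, one per branch."""
    k_cw, k_hc, k_hw = kernels
    # Branch 1: swap C and H, so pooling runs over H and the gate spans (C, W).
    b1 = attention_gate(x.transpose(0, 2, 1, 3), k_cw).transpose(0, 2, 1, 3)
    # Branch 2: swap C and W, so pooling runs over W and the gate spans (H, C).
    b2 = attention_gate(x.transpose(0, 3, 2, 1), k_hc).transpose(0, 3, 2, 1)
    # Branch 3: no rotation; pooling runs over C and the gate spans (H, W).
    b3 = attention_gate(x, k_hw)
    # Average the three branches (an assumption of this sketch).
    return (b1 + b2 + b3) / 3.0


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 16, 16))
# Random 7x7 kernels stand in for learned weights (illustrative only).
kernels = [0.1 * rng.standard_normal((2, 7, 7)) for _ in range(3)]
y = triplet_attention(x, kernels)
print(y.shape)  # (2, 8, 16, 16)
```

Because the output has the same shape as the input, a module of this kind can be dropped between existing layers of a backbone, which is what makes it usable as an add-on. Each branch adds only a single two-channel k×k kernel, consistent with the small overhead the abstract claims.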

Code Implementations: 5 repos