PAM: Point-wise Attention Module for 6D Object Pose Estimation
This work addresses a key bottleneck in 6D pose estimation for robotics and AR/VR applications, though it appears incremental as it builds on attention mechanisms for feature enhancement.
The paper tackles insufficient feature extraction in 6D object pose estimation by proposing a Point Attention Module (PAM) that extracts features from RGB-D data through geometric and channel attention paths; it reports outperforming existing methods on the YCB Video and LineMod benchmarks.
6D pose estimation refers to recognizing an object and estimating its 3D rotation and 3D translation. The key to 6D pose estimation is extracting features rich enough to recover the pose in any environment. Previous methods either utilized depth information in the refinement process or were designed as heterogeneous architectures for each data space to extract features. However, these methods are limited in that they cannot extract sufficient features. Therefore, this paper proposes a Point Attention Module that can efficiently extract powerful features from RGB-D data. In our module, the attention map is formed through a Geometric Attention Path (GAP) and a Channel Attention Path (CAP). GAP is designed to attend to the important parts of the geometric information, and CAP is designed to attend to the important parts of the channel information. We show that the attention module efficiently creates feature representations without significantly increasing computational complexity. Experimental results show that the proposed method outperforms existing methods on the YCB Video and LineMod benchmarks. In addition, when the attention module was applied to a classification task, performance improved significantly compared to the existing model.
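To make the two-path idea concrete, the following is a minimal sketch of how a per-point (geometric) weight and a per-channel weight could jointly rescale point features. It is not the paper's architecture: the abstract does not describe the internals of GAP or CAP, so the function names, the centroid-distance cue, the mean pooling, and the sigmoid gating are all assumptions chosen for illustration, with no learned parameters.

```python
import math


def sigmoid(x):
    """Squash a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features):
    """Hypothetical CAP stand-in: one weight per channel.

    Assumption: each channel is pooled by its mean over all points
    and gated with a sigmoid.
    """
    n = len(features)
    c = len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(c)]
    return [sigmoid(m) for m in means]


def geometric_attention(points):
    """Hypothetical GAP stand-in: one weight per point.

    Assumption: the geometric cue is the distance of each 3D point
    to the cloud centroid; closer points get larger weights.
    """
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    dists = [math.dist(p, centroid) for p in points]
    return [sigmoid(-d) for d in dists]


def point_attention(points, features):
    """Rescale an N x C feature matrix by per-point and per-channel weights."""
    g = geometric_attention(points)
    c = channel_attention(features)
    return [
        [f * g[i] * c[j] for j, f in enumerate(row)]
        for i, row in enumerate(features)
    ]


# Two points with two feature channels each (toy values).
points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
features = [[1.0, 0.0], [1.0, 0.0]]
attended = point_attention(points, features)
```

In this sketch the attention adds only N point weights and C channel weights on top of the N x C features, which is one way an attention module can stay cheap relative to the backbone; the module proposed in the paper presumably learns its weights rather than using fixed rules like these.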