CV MM Apr 27, 2021

Dual Transformer for Point Cloud Analysis

Xian-Feng Han, Yi-Fei Jin, Hui-Xian Cheng, Guo-Qiang Xiao

arXiv:2104.13044v1 13.5100 citations

Originality: Incremental advance

AI Analysis

This work addresses point cloud understanding for 3D vision applications, presenting an incremental improvement over existing transformer-based methods.

The paper tackles point cloud analysis by proposing a Dual Transformer Network (DTNet) that uses dual attention mechanisms to capture contextual dependencies, achieving highly competitive performance in 3D classification and segmentation tasks.

Following the tremendous success of transformers in natural language processing and image understanding, we present a novel point cloud representation learning architecture named Dual Transformer Network (DTNet), whose main component is the Dual Point Cloud Transformer (DPCT) module. Specifically, by aggregating carefully designed point-wise and channel-wise multi-head self-attention models simultaneously, the DPCT module captures much richer semantic contextual dependencies from the perspectives of both position and channel. With the DPCT module as the fundamental building block, we construct DTNet to perform point cloud analysis in an end-to-end manner. Extensive quantitative and qualitative experiments on publicly available benchmarks demonstrate the effectiveness of the proposed transformer framework for 3D point cloud classification and segmentation, where it achieves highly competitive performance compared with state-of-the-art approaches.
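To make the idea of combining point-wise and channel-wise self-attention concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it uses a single head rather than multiple heads, random projection weights, and an assumed residual sum to fuse the two branches. The names `init_params`, `pointwise_attention`, `channelwise_attention`, and `dual_attention_block` are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(channels, rng):
    # Hypothetical parameter layout: query/key/value projections for the
    # point-wise branch (p*) and the channel-wise branch (c*).
    scale = 1.0 / np.sqrt(channels)
    names = ("pq", "pk", "pv", "cq", "ck", "cv")
    return {n: rng.standard_normal((channels, channels)) * scale for n in names}

def pointwise_attention(feats, wq, wk, wv):
    # feats: (N, C) per-point features. Attention is computed between points,
    # so the attention map has shape (N, N).
    q, k, v = feats @ wq, feats @ wk, feats @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return attn @ v  # (N, C)

def channelwise_attention(feats, wq, wk, wv):
    # Attention is computed between channels, so the attention map has
    # shape (C, C); the sum inside q.T @ k runs over all points.
    q, k, v = feats @ wq, feats @ wk, feats @ wv
    attn = softmax(q.T @ k / np.sqrt(q.shape[0]), axis=-1)
    return v @ attn.T  # (N, C)

def dual_attention_block(feats, p):
    # Assumption: the two branches are fused by element-wise addition with a
    # residual connection; other fusion schemes are equally plausible.
    point = pointwise_attention(feats, p["pq"], p["pk"], p["pv"])
    chan = channelwise_attention(feats, p["cq"], p["ck"], p["cv"])
    return feats + point + chan

rng = np.random.default_rng(0)
features = rng.standard_normal((16, 8))  # 16 points, 8 channels
params = init_params(8, rng)
output = dual_attention_block(features, params)
print(output.shape)
```

In this sketch both branches are equivariant to the ordering of the input points: reordering the rows of `features` reorders the rows of `output` in the same way, which is the property one would want for unordered point sets.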
