CV · MM · Apr 27, 2021

Dual Transformer for Point Cloud Analysis

arXiv:2104.13044v1 · 100 citations
Originality: Incremental advance
AI Analysis

This work addresses point cloud understanding for 3D vision applications, presenting an incremental improvement over existing transformer-based methods.

The paper tackles point cloud analysis by proposing a Dual Transformer Network (DTNet) that uses dual attention mechanisms to capture contextual dependencies, achieving highly competitive performance in 3D classification and segmentation tasks.

Following the tremendous success of transformers in natural language processing and image understanding tasks, we present a novel point cloud representation learning architecture, named Dual Transformer Network (DTNet), which mainly consists of the Dual Point Cloud Transformer (DPCT) module. Specifically, by simultaneously aggregating well-designed point-wise and channel-wise multi-head self-attention models, the DPCT module can capture semantically richer contextual dependencies from the perspectives of position and channel. With the DPCT module as a fundamental component, we construct DTNet to perform point cloud analysis in an end-to-end manner. Extensive quantitative and qualitative experiments on publicly available benchmarks demonstrate the effectiveness of the proposed transformer framework for 3D point cloud classification and segmentation, achieving highly competitive performance in comparison with state-of-the-art approaches.
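To make the dual-attention idea in the abstract concrete, here is a minimal, dependency-free sketch. It is not the authors' implementation. It assumes a single head, identity query/key/value projections, and fusion by element-wise sum with a residual connection; all function names (`point_wise_attention`, `channel_wise_attention`, `dual_attention`) are hypothetical and chosen for illustration only. The point of the sketch is the shape difference: point-wise attention weights are N x N (over points), while channel-wise weights are C x C (over feature channels).

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def point_wise_attention(x):
    """Attention over points. x is N x C; the weights are N x N.
    Assumption: Q = K = V = x (no learned projections)."""
    scale = math.sqrt(len(x[0]))
    scores = [[v / scale for v in row] for row in matmul(x, transpose(x))]
    weights = softmax_rows(scores)
    return matmul(weights, x), weights

def channel_wise_attention(x):
    """Attention over channels. x is N x C; the weights are C x C.
    Assumption: Q = K = V = x (no learned projections)."""
    scale = math.sqrt(len(x))
    xt = transpose(x)  # C x N: one row per channel
    scores = [[v / scale for v in row] for row in matmul(xt, x)]
    weights = softmax_rows(scores)
    # Each output channel is a weighted mix of the input channels.
    return transpose(matmul(weights, xt)), weights

def dual_attention(x):
    """Fuse both branches with the input by element-wise sum (assumed fusion rule)."""
    p_out, _ = point_wise_attention(x)
    c_out, _ = channel_wise_attention(x)
    return [[a + b + c for a, b, c in zip(rx, rp, rc)]
            for rx, rp, rc in zip(x, p_out, c_out)]

# Toy input: 4 points, each with 3 feature channels.
features = [[1.0, 0.0, 2.0],
            [0.5, 1.5, 0.0],
            [2.0, 1.0, 1.0],
            [0.0, 0.5, 0.5]]
fused = dual_attention(features)
```

The fused output keeps the input shape (N x C), so a block like this could be stacked. A faithful version would add learned projections and multiple heads, which this sketch deliberately omits.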

