CV, Mar 2, 2023

BPT: Binary Point Cloud Transformer for Place Recognition

Zhixing Hou, Yuzhang Shang, Tian Gao, Yan Yan

arXiv:2303.01166v12.83 citations, h-index: 15

Originality: Highly original

AI Analysis

This work addresses the challenge of deploying point cloud transformers on mobile or embedded devices for online applications like place recognition, offering a more efficient solution.

The paper tackles the high memory and computation costs of point cloud transformers for place recognition in robotics by proposing a binary point cloud transformer. It reduces a 32-bit model to 1-bit, cutting model size by 56.1% and FLOPs by 34.1%, while achieving 93.28% average recall at top 1% on the Oxford RobotCar dataset.

Place recognition, an algorithm for recognizing revisited places, triggers back-end optimization in a full SLAM system. Many works equipped with deep learning tools, such as MLPs, CNNs, and transformers, have achieved great improvements in this research field. The point cloud transformer is one of the excellent frameworks for place recognition in robotics, but its large memory consumption and expensive computation make it difficult to deploy the various point cloud transformer networks widely on mobile or embedded devices. To solve this issue, we propose a binary point cloud transformer for place recognition. As a result, a 32-bit full-precision model can be reduced to a 1-bit model with a smaller memory footprint and faster binarized bitwise operations. To the best of our knowledge, this is the first binary point cloud transformer that can be deployed on mobile devices for online applications such as place recognition. Experiments on several standard benchmarks demonstrate that the proposed method achieves results comparable to the corresponding full-precision transformer model and even outperforms some full-precision deep learning methods. For example, the proposed method achieves an average recall rate of 93.28% at top 1% and 85.74% at top 1 on the Oxford RobotCar dataset. Meanwhile, for the same transformer structure, the model size and floating point operations are reduced by 56.1% and 34.1%, respectively, when going from full precision to binary precision.
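To illustrate why 1-bit models can replace floating point arithmetic with cheap bitwise operations, the sketch below shows the generic XNOR/popcount identity used by binary networks in general. It is not the authors' implementation: the helper names `binarize`, `pack_bits`, and `binary_dot` are hypothetical, and sign-based binarization is assumed here for illustration only.

```python
import random

def binarize(values):
    """Map real values to {-1, +1} with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]

def pack_bits(signs):
    """Encode a {-1, +1} vector as an int: bit i is set where signs[i] == +1."""
    packed = 0
    for i, s in enumerate(signs):
        if s == 1:
            packed |= 1 << i
    return packed

def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors from their bit encodings.

    Matching positions contribute +1 and differing positions contribute -1,
    so the result is n - 2 * popcount(a XOR b).
    """
    differing = bin(a_bits ^ b_bits).count("1")
    return n - 2 * differing

# Compare the bitwise result with an ordinary integer dot product.
random.seed(0)
n = 64
w = binarize([random.gauss(0, 1) for _ in range(n)])
x = binarize([random.gauss(0, 1) for _ in range(n)])
reference = sum(wi * xi for wi, xi in zip(w, x))
assert binary_dot(pack_bits(w), pack_bits(x), n) == reference
```

On hardware, the XOR and bit count map to single instructions over whole machine words, which is the general source of the speedup that binarized operations offer over 32-bit multiply-accumulate.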
