DMF-Net: Image-Guided Point Cloud Completion with Dual-Channel Modality Fusion and Shape-Aware Upsampling Transformer
This work addresses the challenge of accurately completing 3D point clouds from partial scans and single-view images, which is important for applications such as robotics and augmented reality. The contribution is incremental, however, because it builds on existing multimodal fusion approaches.
The paper tackles the problem of single-view image-guided point cloud completion by proposing DMF-Net, which uses dual-channel modality fusion and a shape-aware upsampling transformer to recover dense and complete point clouds, outperforming state-of-the-art methods on the ShapeNet-ViPC dataset.
In this paper, we study the task of single-view image-guided point cloud completion. Existing methods have achieved promising results by fusing image information into the point cloud, either explicitly or implicitly. However, given that the image carries global shape information and the partial point cloud carries rich local details, we believe that both modalities need to be given equal attention during modality fusion. To this end, we propose a novel dual-channel modality fusion network for image-guided point cloud completion (named DMF-Net), which works in a coarse-to-fine manner. In the first stage, DMF-Net takes a partial point cloud and the corresponding image as input to recover a coarse point cloud. In the second stage, the coarse point cloud is upsampled twice with a shape-aware upsampling transformer to obtain the dense and complete point cloud. Extensive quantitative and qualitative experimental results show that DMF-Net outperforms state-of-the-art unimodal and multimodal point cloud completion methods on the ShapeNet-ViPC dataset.
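The coarse-to-fine data flow described above can be illustrated with a minimal sketch. It is not the DMF-Net implementation: the abstract does not specify the fusion operator, the decoder, the upsampling ratios, or any sizes, so everything below is an assumption made for illustration. Symmetric cross-attention stands in for the dual-channel fusion, a random linear map stands in for the coarse decoder, duplicate-and-jitter stands in for the shape-aware upsampling transformer, and all function names (`dual_channel_fusion`, `decode_coarse`, `upsample`, `complete`) and the ratios `(2, 4)` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Refine each query vector with an attention-weighted sum of context vectors."""
    dim = len(queries[0])
    refined = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, c)) / math.sqrt(dim) for c in context]
        weights = softmax(scores)
        mixed = [sum(w * c[j] for w, c in zip(weights, context)) for j in range(dim)]
        refined.append([a + b for a, b in zip(q, mixed)])  # residual connection
    return refined


def dual_channel_fusion(point_feats, image_feats):
    """Assumed symmetric fusion: each modality attends to the other."""
    point_channel = cross_attend(point_feats, image_feats)  # points guided by image
    image_channel = cross_attend(image_feats, point_feats)  # image guided by points
    return point_channel, image_channel


def max_pool(feats):
    """Column-wise maximum over a list of feature vectors."""
    return [max(col) for col in zip(*feats)]


def decode_coarse(point_channel, image_channel, n_coarse, rng):
    """Placeholder decoder: random linear map from pooled features to xyz."""
    pooled = max_pool(point_channel) + max_pool(image_channel)
    coarse = []
    for _ in range(n_coarse):
        coarse.append([sum(rng.gauss(0.0, 0.1) * v for v in pooled) for _ in range(3)])
    return coarse


def upsample(points, ratio, rng, jitter=0.01):
    """Placeholder upsampler: copy each point `ratio` times and add small noise."""
    return [[c + rng.gauss(0.0, jitter) for c in p] for p in points for _ in range(ratio)]


def complete(point_feats, image_feats, n_coarse=8, ratios=(2, 4), seed=0):
    """Stage 1: fuse both modalities and decode a coarse cloud. Stage 2: upsample twice."""
    rng = random.Random(seed)
    point_channel, image_channel = dual_channel_fusion(point_feats, image_feats)
    coarse = decode_coarse(point_channel, image_channel, n_coarse, rng)
    dense = coarse
    for ratio in ratios:  # two successive upsampling steps (ratios are assumed)
        dense = upsample(dense, ratio, rng)
    return coarse, dense
```

With the default arguments, `complete` returns 8 coarse points and 8 × 2 × 4 = 64 dense points, each a list of three coordinates. The only property this sketch shares with the described method is the structure: both modalities are treated symmetrically during fusion, and the dense output is produced by two successive upsampling steps applied to the coarse prediction.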