Synergizing Contrastive Learning and Optimal Transport for 3D Point Cloud Domain Adaptation
This work addresses domain discrepancies in 3D point clouds, which matter for applications such as robotics and virtual reality; it represents an incremental improvement over existing methods.
The paper tackles unsupervised domain adaptation for 3D point cloud classification by combining multimodal contrastive learning and optimal transport to reduce cross-domain shift, achieving state-of-the-art performance with a 4-12% margin on GraspNetPC-10 and best average results on PointDA-10.
The fundamental problem of unsupervised domain adaptation (UDA) on 3D point clouds has recently drawn attention, motivated by a wide variety of applications in robotics, virtual reality, and scene understanding, to name a few. Differences in point cloud acquisition procedures give rise to significant domain discrepancies and geometric variations among both similar and dissimilar classes. Standard domain adaptation methods developed for images do not directly translate to point cloud data because of its complex geometric nature. To address this challenge, we leverage two ideas: multimodality and alignment between distributions. We propose a new UDA architecture for point cloud classification that uses multimodal contrastive learning to obtain better class separation in each domain individually. Further, we use optimal transport (OT) to learn the source and target data distributions jointly, which reduces the cross-domain shift and provides a better alignment. We conduct a comprehensive empirical study on PointDA-10 and GraspNetPC-10 and show that our method achieves state-of-the-art performance on GraspNetPC-10 (by a margin of approximately 4-12%) and the best average performance on PointDA-10. Our ablation studies and decision boundary analysis further validate the significance of our contrastive learning module and OT alignment.
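To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not the paper's architecture or loss definitions, which the abstract does not specify. It assumes an InfoNCE-style loss as a stand-in for the multimodal contrastive term and an entropy-regularised (Sinkhorn) transport cost as a stand-in for the OT alignment term. The function names `info_nce`, `sinkhorn_plan`, and `ot_alignment_loss`, the squared Euclidean cost, and the values of `temperature`, `eps`, and `n_iters` are all illustrative assumptions.

```python
import math


def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss (assumed stand-in for the contrastive term).

    z_a[i] and z_b[i] are embeddings of the same sample seen through two
    modalities; every other z_b[j] acts as a negative for z_a[i].
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    z_a = [normalize(v) for v in z_a]
    z_b = [normalize(v) for v in z_b]
    loss = 0.0
    for i, anchor in enumerate(z_a):
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature
                  for other in z_b]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(z_a)


def sinkhorn_plan(source, target, eps=0.1, n_iters=200):
    """Entropy-regularised OT plan between two uniformly weighted feature sets.

    Returns (plan, cost), where cost[i][j] is the squared Euclidean distance
    (an assumed ground cost) between source[i] and target[j].
    """
    n, m = len(source), len(target)
    cost = [[sum((x - y) ** 2 for x, y in zip(s, t)) for t in target]
            for s in source]
    c_max = max(max(row) for row in cost) or 1.0  # rescale to keep exp() stable
    kernel = [[math.exp(-c / (c_max * eps)) for c in row] for row in cost]
    a, b = [1.0 / n] * n, [1.0 / m] * m           # uniform marginals
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    return plan, cost


def ot_alignment_loss(source, target, **kwargs):
    """Transport cost <plan, cost>; smaller when the two feature sets align."""
    plan, cost = sinkhorn_plan(source, target, **kwargs)
    return sum(p * c for p_row, c_row in zip(plan, cost)
               for p, c in zip(p_row, c_row))
```

In a training loop, terms of this kind would typically be added to the supervised source classification loss with weighting coefficients. How the paper actually combines and weights its terms is not stated in the abstract, so that choice is left open here.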