CV | Aug 8, 2025

MCA: 2D-3D Retrieval with Noisy Labels via Multi-level Adaptive Correction and Alignment

Gui Zou, Chaofan Gan, Chern Hong Lim, Supavadee Aramvith, Weiyao Lin

arXiv:2508.06104v1 | 3.6 | h-index: 19 | 2025 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Originality: Incremental advance

AI Analysis

The work addresses robust retrieval for applications that use 2D and 3D data with imperfect annotations, and it represents an incremental improvement over existing methods.

The paper tackles the problem of 2D-3D cross-modal retrieval with noisy labels by proposing a Multi-level Adaptive Correction and Alignment framework, achieving state-of-the-art performance on benchmarks.

With the increasing availability of 2D and 3D data, significant advancements have been made in the field of cross-modal retrieval. Nevertheless, imperfect annotations present considerable challenges, demanding robust solutions for 2D-3D cross-modal retrieval under noisy-label conditions. Existing methods generally address noise by dividing samples independently within each modality, which makes them susceptible to overfitting on corrupted labels. To address these issues, we propose a robust 2D-3D Multi-level cross-modal adaptive Correction and Alignment framework (MCA). Specifically, we introduce a Multimodal Joint label Correction (MJC) mechanism that leverages multimodal historical self-predictions to jointly model the modality prediction consistency, enabling reliable label refinement. Additionally, we propose a Multi-level Adaptive Alignment (MAA) strategy to effectively enhance cross-modal feature semantics and discrimination across different levels. Extensive experiments demonstrate the superiority of our method, MCA, which achieves state-of-the-art performance on both conventional and realistic noisy 3D benchmarks, highlighting its generality and effectiveness.
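The abstract does not spell out how MJC works, so the following is only an illustrative sketch of the general idea of refining labels from historical predictions in two modalities. Everything in it is an assumption rather than the authors' method: the function names `update_history` and `joint_correct`, the exponential moving average used as the "history", the simple averaging of the two modalities, and the 0.8 confidence threshold are all hypothetical choices made for the example.

```python
import numpy as np

def update_history(history, probs, momentum=0.9):
    """Blend new class probabilities into a running per-sample average.

    Assumption: an exponential moving average stands in for
    'historical self-predictions'; the paper may use something else.
    """
    return momentum * history + (1.0 - momentum) * probs

def joint_correct(labels, hist_2d, hist_3d, threshold=0.8):
    """Relabel a sample only when both modalities agree with high confidence.

    labels:  (N,) integer labels, possibly noisy
    hist_2d: (N, C) historical class probabilities from the 2D branch
    hist_3d: (N, C) historical class probabilities from the 3D branch
    Returns the corrected labels and a boolean mask of relabeled samples.
    """
    pred_2d = hist_2d.argmax(axis=1)
    pred_3d = hist_3d.argmax(axis=1)
    # Hypothetical joint score: plain average of the two modalities.
    joint_conf = (0.5 * (hist_2d + hist_3d)).max(axis=1)
    relabel = (pred_2d == pred_3d) & (joint_conf >= threshold) & (pred_2d != labels)
    corrected = np.where(relabel, pred_2d, labels)
    return corrected, relabel

# Toy example with three samples and three classes.
labels = np.array([0, 1, 2])
hist_2d = np.array([[0.90, 0.05, 0.05],
                    [0.10, 0.10, 0.80],
                    [0.40, 0.30, 0.30]])
hist_3d = np.array([[0.85, 0.10, 0.05],
                    [0.05, 0.05, 0.90],
                    [0.20, 0.50, 0.30]])
corrected, relabel = joint_correct(labels, hist_2d, hist_3d)
print(corrected.tolist())  # [0, 2, 2]
print(relabel.tolist())    # [False, True, False]
```

In this toy run, the second sample is relabeled because both branches confidently predict class 2, while the third sample keeps its given label because the two modalities disagree. Requiring agreement across modalities, rather than filtering each modality independently, is the contrast with prior work that the abstract draws.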
