LG CV Jul 5, 2025

Consistency-Aware Padding for Incomplete Multi-Modal Alignment Clustering Based on Self-Repellent Greedy Anchor Search

Shubin Ma, Liang Zhao, Mingdong Lu, Yifan Guo, Bo Xu

arXiv:2507.03917v14.19 citations h-index: 3 IJCAI

Originality: Incremental advance

AI Analysis

The paper addresses incomplete and misaligned multimodal data, a common issue in real-world applications such as sensor networks, but appears incremental because it builds on existing alignment and padding techniques.

The paper tackles the problem of filling missing data in multimodal datasets that are both imbalanced and misaligned. It proposes a method intended to improve the quality of multimodal data fusion, and the authors report superior results on benchmark datasets.

Multimodal representations describe real-world data samples faithfully and effectively by capturing the complementary information they carry. However, collected data are often incomplete and misaligned due to factors such as inconsistent sensor frequencies and device malfunctions. Existing research has not effectively addressed the filling of missing data in scenarios where multiview data are both imbalanced and misaligned; instead, it relies on class-level alignment of the available data. As a result, some data samples are not well matched, which degrades the quality of data fusion. In this paper, we propose Consistency-Aware Padding for Incomplete Multimodal Alignment Clustering Based on Self-Repellent Greedy Anchor Search (CAPIMAC) to tackle the problem of filling imbalanced and misaligned data in multimodal datasets. Specifically, we propose a self-repellent greedy anchor search module (SRGASM), which employs a self-repellent random walk combined with a greedy algorithm to identify anchor points for re-representing incomplete and misaligned multimodal data. Subsequently, based on noise-contrastive learning, we design a consistency-aware padding module (CAPM) to effectively interpolate and align imbalanced and misaligned data, thereby improving the quality of multimodal data fusion. Experimental results on benchmark datasets demonstrate the superiority of our method. The code will be publicly released at https://github.com/Autism-mm/CAPIMAC.git.
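
The abstract gives no algorithmic detail for SRGASM, so the following is only a rough illustration of the two ingredients it names, not the authors' implementation. It assumes a Gaussian-kernel similarity graph, a walk whose transition scores are damped by past visit counts (the hypothetical `alpha` parameter), and a farthest-point rule as the greedy step. The function names `self_repellent_walk` and `greedy_anchors` are invented for this sketch.

```python
import math
import random


def self_repellent_walk(weights, start, steps, alpha=1.0, rng=None):
    """Walk on a weighted graph, penalising nodes by how often they were visited.

    weights[i][j] is a non-negative edge weight. alpha is an assumed repulsion
    strength; alpha=0 reduces to an ordinary weighted random walk.
    """
    rng = rng or random.Random(0)
    n = len(weights)
    visits = [0] * n
    current = start
    visits[current] = 1
    path = [current]
    for _ in range(steps):
        # Edge weight damped by the visit count of the target node.
        scores = [
            weights[current][j] * (visits[j] + 1) ** (-alpha) if j != current else 0.0
            for j in range(n)
        ]
        if sum(scores) == 0:
            break  # isolated node: nowhere to go
        current = rng.choices(range(n), weights=scores, k=1)[0]
        visits[current] += 1
        path.append(current)
    return path, visits


def greedy_anchors(points, candidates, k):
    """Greedy farthest-point selection of up to k anchors among candidate indices.

    This is one possible greedy rule, chosen here for illustration only.
    """
    candidates = sorted(set(candidates))
    anchors = [candidates[0]]
    while len(anchors) < min(k, len(candidates)):
        # Pick the candidate whose nearest chosen anchor is farthest away.
        best = max(
            (c for c in candidates if c not in anchors),
            key=lambda c: min(math.dist(points[c], points[a]) for a in anchors),
        )
        anchors.append(best)
    return anchors


# Toy data: two small groups of 2-D points (illustrative values only).
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (2.0, 2.0), (2.1, 1.8), (1.9, 2.2)]
weights = [
    [math.exp(-math.dist(p, q) ** 2) if p != q else 0.0 for q in points]
    for p in points
]
path, visits = self_repellent_walk(weights, start=0, steps=50, alpha=2.0,
                                   rng=random.Random(0))
anchors = greedy_anchors(points, candidates=path, k=2)
print("anchors:", anchors)
```

The repulsion term pushes the walk away from nodes it has already covered, so the candidate set tends to spread across the graph. The greedy step then keeps a small, mutually distant subset of those candidates as anchors.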
