LG AI · Oct 23, 2025

Multimodal Negative Learning

Baoquan Gong, Xiyuan Gao, Pengfei Zhu, Qinghua Hu, Bing Cao

arXiv:2510.20877v1 · 1 citation · h-index: 14 · Has Code

Originality: Highly original

AI Analysis

This work addresses modality imbalance in multimodal learning systems, which can hinder performance in applications such as computer vision and natural language processing. It proposes a new learning paradigm rather than an incremental improvement.

The paper tackles modality imbalance in multimodal learning by introducing Multimodal Negative Learning, a paradigm in which dominant modalities dynamically guide weak modalities to suppress non-target classes. The authors report improved robustness and reduced empirical error for weak modalities, particularly under noisy and imbalanced scenarios.

Multimodal learning systems often encounter challenges related to modality imbalance, where a dominant modality may overshadow others, thereby hindering the learning of weak modalities. Conventional approaches often force weak modalities to align with dominant ones, a "Learning to be (the same)" strategy (Positive Learning), which risks suppressing the unique information inherent in the weak modalities. To address this challenge, we offer a new learning paradigm: "Learning Not to be" (Negative Learning). Instead of enhancing weak modalities' target-class predictions, the dominant modalities dynamically guide the weak modality to suppress non-target classes. This stabilizes the decision space and lets weak modalities retain their modality-specific information without being over-aligned. We then analyze multimodal learning from a robustness perspective and theoretically derive the Multimodal Negative Learning (MNL) framework, which introduces a dynamic guidance mechanism tailored for negative learning. Our method provably tightens the robustness lower bound of multimodal learning by increasing the Unimodal Confidence Margin (UCoM) and reduces the empirical error of weak modalities, particularly under noisy and imbalanced scenarios. Extensive experiments across multiple benchmarks demonstrate the effectiveness and generalizability of our approach against competing methods. The code will be available at https://github.com/BaoquanGong/Multimodal-Negative-Learning.git.
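
To make the "suppress non-target classes" idea concrete, here is a minimal sketch of a negative-learning style loss for a single sample. It is an illustration under stated assumptions, not the authors' MNL implementation: the function names, the weighting rule (each non-target class weighted by the dominant modality's confidence that the class is wrong), and the margin definition (target probability minus the largest non-target probability) are all hypothetical choices made for this example. The abstract does not specify the actual guidance mechanism or the formal UCoM definition.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def negative_learning_loss(weak_logits, dominant_logits, target, eps=1e-12):
    """Hypothetical negative-learning loss for one sample.

    For every non-target class k, the weak modality is penalized by
    -log(1 - p_weak[k]), i.e. it is pushed to lower its probability on k.
    ASSUMPTION: each term is weighted by (1 - p_dom[k]), the dominant
    modality's confidence that class k is not the answer. The target-class
    probability of the weak modality is never pushed up directly.
    """
    p_weak = softmax(weak_logits)
    p_dom = softmax(dominant_logits)
    loss = 0.0
    for k in range(len(p_weak)):
        if k == target:
            continue
        weight = 1.0 - p_dom[k]
        loss += -weight * math.log(max(1.0 - p_weak[k], eps))
    return loss


def confidence_margin(logits, target):
    """ASSUMED margin: p[target] minus the largest non-target probability."""
    p = softmax(logits)
    best_other = max(p[k] for k in range(len(p)) if k != target)
    return p[target] - best_other


# Example: three classes, true class 0, dominant modality is confident.
dominant = [3.0, 0.0, 0.0]
weak_good = [2.0, 0.0, 0.0]   # little mass on non-target classes
weak_bad = [0.0, 2.0, 0.0]    # most mass on a non-target class

loss_good = negative_learning_loss(weak_good, dominant, target=0)
loss_bad = negative_learning_loss(weak_bad, dominant, target=0)
```

In this sketch, `loss_bad` exceeds `loss_good` because the second weak prediction places most of its mass on a non-target class. Minimizing such a loss lowers non-target probabilities, which tends to widen the assumed margin without forcing the weak modality to copy the dominant modality's target-class prediction.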

