CV · LG · Jun 7, 2024

Predictive Dynamic Fusion

arXiv:2406.04802v3 · 35 citations · Has Code
AI Analysis

This work addresses reliability and stability issues in multimodal learning for joint decision-making systems, representing an incremental improvement with theoretical backing.

The paper tackles two weaknesses of existing dynamic multimodal fusion methods: missing theoretical guarantees and suboptimal performance. It proposes a Predictive Dynamic Fusion (PDF) framework that theoretically derives a predictable Collaborative Belief (Co-Belief) from Mono- and Holo-Confidence, which provably reduces the upper bound of generalization error, and it adds a relative calibration strategy to account for uncertainty in the predicted Co-Belief. Experiments on multiple benchmarks support the approach.

Multimodal fusion is crucial in joint decision-making systems for rendering holistic judgments. Since multimodal data changes in open environments, dynamic fusion has emerged and achieved remarkable progress in numerous applications. However, most existing dynamic multimodal fusion methods lack theoretical guarantees and easily fall into suboptimal problems, yielding unreliability and instability. To address this issue, we propose a Predictive Dynamic Fusion (PDF) framework for multimodal learning. We proceed to reveal the multimodal fusion from a generalization perspective and theoretically derive the predictable Collaborative Belief (Co-Belief) with Mono- and Holo-Confidence, which provably reduces the upper bound of generalization error. Accordingly, we further propose a relative calibration strategy to calibrate the predicted Co-Belief for potential uncertainty. Extensive experiments on multiple benchmarks confirm our superiority. Our code is available at https://github.com/Yinan-Xia/PDF.
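To make the idea of dynamic fusion concrete, the following is a minimal sketch of per-sample, confidence-weighted late fusion of two modalities. It is not the paper's method: the abstract does not give the Co-Belief, Mono-Confidence, or Holo-Confidence formulas, so the sketch substitutes the maximum softmax probability as a stand-in confidence score. The function names `softmax` and `dynamic_fuse` and the example logits are hypothetical and chosen only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_fuse(logits_per_modality):
    """Fuse per-modality logits for ONE sample with sample-specific weights.

    Assumption: confidence is approximated by the maximum softmax
    probability of each modality. This is a stand-in, NOT the Co-Belief
    derived in the paper.
    """
    # One confidence score per modality, computed from that modality alone.
    confidences = [max(softmax(logits)) for logits in logits_per_modality]
    # Normalize confidences into fusion weights that sum to 1.
    weights = softmax(confidences)
    n_classes = len(logits_per_modality[0])
    # Weighted sum of logits, class by class.
    fused = [
        sum(w * logits[c] for w, logits in zip(weights, logits_per_modality))
        for c in range(n_classes)
    ]
    return fused, weights


# Hypothetical example: a confident image branch and an uncertain text branch.
image_logits = [4.0, 0.5, 0.1]
text_logits = [0.2, 0.3, 0.1]
fused, weights = dynamic_fuse([image_logits, text_logits])
predicted_class = fused.index(max(fused))
```

Because the weights are recomputed for every sample, a modality that is uninformative on one input can be down-weighted there and still dominate on another input. The paper's contribution, per the abstract, is a theoretically grounded choice of those weights rather than a heuristic like the one used here.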

Code Implementations: 1 repo