MMLGJun 14, 2023

Towards Balanced Active Learning for Multimodal Classification

arXiv:2306.08306v212 citationsh-index: 32
Originality Incremental advance
AI Analysis

This work addresses the issue of unfair data selection in multimodal active learning, which is crucial for reducing annotation costs while maintaining balanced performance across modalities, though it is incremental in improving existing strategies.

The paper tackles the problem of biased sample selection in multimodal active learning, where existing unimodal strategies favor the dominant modality, and proposes a novel approach that modulates gradient embeddings to achieve more balanced learning, outperforming existing methods on various multimodal classification tasks.

Training multimodal networks requires a vast amount of data due to their larger parameter space compared to unimodal networks. Active learning is a widely used technique for reducing data annotation costs by selecting only those samples that could contribute to improving model performance. However, current active learning strategies are mostly designed for unimodal tasks, and when applied to multimodal data, they often result in biased sample selection from the dominant modality. This unfairness hinders balanced multimodal learning, which is crucial for achieving optimal performance. To address this issue, we propose three guidelines for designing a more balanced multimodal active learning strategy. Following these guidelines, a novel approach is proposed to achieve more fair data selection by modulating the gradient embedding with the dominance degree among modalities. Our studies demonstrate that the proposed method achieves more balanced multimodal learning by avoiding greedy sample selection from the dominant modality. Our approach outperforms existing active learning strategies on a variety of multimodal classification tasks. Overall, our work highlights the importance of balancing sample selection in multimodal active learning and provides a practical solution for achieving more balanced active learning for multimodal classification.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes