LG · AI · May 24

QASA: Quality-Aware Semantic Augmentation for Robust Multimodal Sentiment Analysis

arXiv:2601.06870 · 50.2 · h-index: 3
AI Analysis

For researchers in multimodal sentiment analysis, QASA provides an automated data augmentation strategy that enhances generalization and robustness without human annotation.

QASA uses diffusion models to generate augmented visual and auditory samples for multimodal sentiment analysis, improving model robustness under limited high-quality data. It achieves a relative increase of 18.0% in five-class accuracy and 5.9% in binary accuracy on CH-SIMS, and outperforms existing methods on CMU-MOSI and MUStARD.

Multimodal large language models have demonstrated strong ability in capturing semantic representations for multimodal sentiment analysis. Their capacity to learn stable and generalizable multimodal features is limited, however, by the scarcity of high-quality training data. To address this, we propose QASA (Quality-Aware Semantic Augmentation), which uses diffusion models to generate augmented visual and auditory samples, thereby enlarging the training dataset and supporting multimodal learning. The generated samples can vary in quality and may exhibit cross-modal inconsistencies. To manage this, we introduce a decoupled quality-aware scoring module that assigns training weights based on the reliability of each augmented sample. This approach reduces the influence of low-quality data and contributes to more stable and robust model training. The framework combines the generative capabilities of diffusion models with the semantic reasoning of multimodal large models, providing an automated data augmentation strategy that does not require human annotation while improving generalization and robustness under limited high-quality data. Experiments on the CH-SIMS dataset show that QASA yields a relative increase of 18.0% and 5.9% in five-class accuracy (Acc5) and binary accuracy (Acc2), respectively, and it also outperforms existing methods on the CMU-MOSI and MUStARD benchmarks.
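The abstract describes assigning training weights to augmented samples by reliability, but it does not give the scoring formula. The sketch below is only an illustration of the general idea of quality-weighted training, not the authors' method. It assumes three hypothetical per-sample scores in [0, 1] (visual quality, audio quality, cross-modal consistency), combines them by a simple product, and uses the result to weight a cross-entropy loss. The names `combine_quality` and `weighted_loss`, and the choice of a product, are all assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-likelihood of the true class for one sample."""
    return -math.log(softmax(logits)[label])


def combine_quality(visual_q, audio_q, consistency_q):
    """Hypothetical combination of separately computed scores in [0, 1].

    A product is an assumption for illustration: a sample that fails on
    any one axis is down-weighted, and a zero on any axis removes it.
    """
    return visual_q * audio_q * consistency_q


def weighted_loss(batch):
    """Weight-normalized cross-entropy.

    batch: list of (logits, label, weight) tuples. Original (non-augmented)
    samples would typically carry weight 1.0; augmented samples carry the
    weight produced by the quality scorer.
    """
    total_weight = sum(w for _, _, w in batch)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(w * cross_entropy(lg, y) for lg, y, w in batch)
    return weighted_sum / total_weight


# One original sample and one augmented sample with poor cross-modal consistency.
batch = [
    ([2.0, 0.0], 0, 1.0),
    ([0.0, 2.0], 0, combine_quality(0.9, 0.8, 0.1)),
]
loss = weighted_loss(batch)
```

With this normalization, an augmented sample whose combined score is near zero contributes almost nothing to the gradient, which is one simple way to realize the "reduces the influence of low-quality data" behavior the abstract describes.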

