CV SD AS | Aug 30, 2022

Video-based Cross-modal Auxiliary Network for Multimodal Sentiment Analysis

Rongfei Chen, Wenju Zhou, Yang Li, Huiyu Zhou

arXiv:2208.13954v15.735 citations | h-index: 13 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses multimodal sentiment analysis for applications relying on audiovisual data, presenting an incremental improvement by enhancing feature extraction and reducing redundancy.

The paper addresses insufficient unimodal feature extraction and data redundancy in multimodal sentiment analysis by proposing a Video-based Cross-modal Auxiliary Network (VCAN). The authors report that VCAN improves classification accuracy on the RAVDESS, CMU-MOSI, and CMU-MOSEI benchmarks and significantly outperforms state-of-the-art methods.

Multimodal sentiment analysis has a wide range of applications because the modalities carry complementary information. Previous works focus mainly on learning efficient joint representations, but they rarely consider insufficient unimodal feature extraction or the data redundancy introduced by multimodal fusion. In this paper, a Video-based Cross-modal Auxiliary Network (VCAN) is proposed, which comprises an audio feature map module and a cross-modal selection module. The first module is designed to substantially increase feature diversity in audio feature extraction, aiming to improve classification accuracy by providing more comprehensive acoustic representations. The second module is designed to handle redundant visual features by efficiently filtering redundant visual frames while integrating audiovisual data. Moreover, a classifier group consisting of several image classification networks is introduced to predict sentiment polarities and emotion categories. Extensive experimental results on the RAVDESS, CMU-MOSI, and CMU-MOSEI benchmarks indicate that VCAN is significantly superior to state-of-the-art methods in classification accuracy for multimodal sentiment analysis.
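To make the idea of filtering redundant visual frames concrete, here is a minimal sketch. It is not the paper's cross-modal selection module, whose details the abstract does not give. It assumes each frame is already represented by a feature vector and drops any frame that is nearly identical to the last kept frame. The names `cosine_similarity`, `select_keyframes`, and the `threshold` value are hypothetical, chosen for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def select_keyframes(frame_features, threshold=0.95):
    """Return indices of frames kept after dropping near-duplicates.

    Assumption (not from the paper): a frame is redundant when its
    cosine similarity to the most recently kept frame is >= threshold.
    """
    kept = []
    for i, feat in enumerate(frame_features):
        if not kept or cosine_similarity(frame_features[kept[-1]], feat) < threshold:
            kept.append(i)
    return kept


# Toy per-frame features: frames 1 and 3 nearly repeat frames 0 and 2.
frames = [
    [1.0, 0.0],
    [0.99, 0.01],
    [0.0, 1.0],
    [0.01, 0.99],
    [1.0, 1.0],
]
print(select_keyframes(frames))  # prints [0, 2, 4]
```

A simple similarity threshold like this uses only visual information. The module described in the abstract selects frames during audiovisual integration, so a faithful implementation would also condition the selection on the audio stream.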
