IR · CL · LG · ML · Jul 15, 2019

Multi-modal Sentiment Analysis using Deep Canonical Correlation Analysis

arXiv:1907.08696v1 · 25 citations
Originality: Incremental advance
AI Analysis

This work addresses sentiment classification for multi-modal data, but it is incremental as it builds on existing DCCA methods with BERT integration.

The paper tackled multi-modal sentiment analysis by learning embeddings from text, audio, and video data using Deep Canonical Correlation Analysis (DCCA), and found that a one-step DCCA approach outperformed the state-of-the-art on benchmark datasets.

This paper learns multi-modal embeddings from text, audio, and video views/modes of data in order to improve downstream sentiment classification. The experimental framework also allows investigation of the relative contributions of the individual views to the final multi-modal embedding. Individual features derived from the three views are combined into a multi-modal embedding using Deep Canonical Correlation Analysis (DCCA) in two ways: (i) One-Step DCCA and (ii) Two-Step DCCA. This paper learns text embeddings using BERT, the current state-of-the-art in text encoders. We posit that this highly optimized algorithm dominates the contribution of the other views, though each view does contribute to the final result. Classification tasks are carried out on two benchmark datasets and on a new Debate Emotion dataset, and together these demonstrate that One-Step DCCA outperforms the current state-of-the-art in learning multi-modal embeddings.
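For readers unfamiliar with the correlation objective behind DCCA, the following is a minimal sketch of plain linear CCA between two views, written with NumPy. It is not the paper's method: DCCA first passes each view through a neural network, and the abstract does not spell out how the One-Step and Two-Step variants combine three views. The function name `linear_cca`, the regularization constant `reg`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def linear_cca(X, Y, k, reg=1e-4):
    """Linear CCA between two views (illustrative stand-in, not DCCA).

    X: (n, dx) array, Y: (n, dy) array with rows aligned by sample.
    Returns the k-dimensional projections of each view and the
    top-k canonical correlations.
    """
    # Center each view.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Regularized covariance and cross-covariance estimates.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the
    # canonical correlations; singular vectors give the directions.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is)
    A = Sxx_is @ U[:, :k]
    B = Syy_is @ Vt.T[:, :k]
    return X @ A, Y @ B, s[:k]


# Hypothetical usage: two synthetic views sharing a 2-D latent signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
view_a = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))
view_b = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))
proj_a, proj_b, corrs = linear_cca(view_a, view_b, k=2)
print(corrs)
```

In DCCA the same correlation quantity is computed on the outputs of per-view networks and used as the training objective, so the projections are nonlinear functions of the raw features rather than the linear maps shown here.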
