Emotion Recognition with Pre-Trained Transformers Using Multimodal Signals
This work addresses emotion recognition for applications in human-computer interaction, but its contribution is incremental: it applies existing methods to a specific domain.
The paper tackles multimodal emotion recognition from physiological signals with a Transformer-based approach and multimodal pre-training, and reports improved performance on a state-of-the-art dataset.
In this paper, we address the problem of multimodal emotion recognition from multiple physiological signals. We demonstrate that a Transformer-based approach is suitable for this task. In addition, we show how such models may be pre-trained in a multimodal scenario to improve emotion recognition performance. We evaluate the benefits of using multimodal inputs and pre-training with our approach on a state-of-the-art dataset.