CV · Sep 18, 2020

Learning Unseen Emotions from Gestures via Semantically-Conditioned Zero-Shot Perception with Adversarial Autoencoders

arXiv:2009.08906v2 · 19 citations
AI Analysis

This work addresses emotion recognition from gestures, with applications in human-computer interaction. Its contribution is incremental: it builds on existing zero-shot learning methods.

The paper recognizes emotions unseen during training from gestures, using a generalized zero-shot learning algorithm. It reaches 58.43% accuracy on the MPI Emotional Body Expressions Database (EBEDB), which is 25–27 percentage points (absolute) above prior state-of-the-art generalized zero-shot learning algorithms.
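To make the size of that gain concrete, here is a minimal arithmetic sketch in Python. It assumes the 25–27% figure is an absolute gain in percentage points, as the paper's abstract words it. The baseline accuracies it derives are implied by subtraction only; they are not figures reported in this summary, and the function names are hypothetical.

```python
def implied_baseline(accuracy: float, absolute_gain: float) -> float:
    """Baseline accuracy implied by a reported accuracy and an absolute gain (both in %)."""
    return round(accuracy - absolute_gain, 2)


def relative_gain(accuracy: float, absolute_gain: float) -> float:
    """The same gain expressed relative to the implied baseline, in %."""
    baseline = accuracy - absolute_gain
    return 100.0 * absolute_gain / baseline


reported_accuracy = 58.43  # accuracy stated for the method on EBEDB

for gain in (25.0, 27.0):  # the stated range of absolute improvement
    baseline = implied_baseline(reported_accuracy, gain)
    relative = relative_gain(reported_accuracy, gain)
    print(f"gain {gain:.0f} points -> implied baseline {baseline:.2f}%, "
          f"relative gain {relative:.1f}%")
```

Read this way, a gain of 25–27 points over baselines in the low 30s is a much larger relative improvement than a "25–27%" relative gain would suggest.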

We present a novel generalized zero-shot algorithm to recognize perceived emotions from gestures. Our task is to map gestures to novel emotion categories not encountered in training. We introduce an adversarial, autoencoder-based representation learning method that correlates 3D motion-captured gesture sequences with the vectorized representation of the natural-language perceived emotion terms using word2vec embeddings. The language-semantic embedding provides a representation of the emotion label space, and we leverage this underlying distribution to map the gesture sequences to the appropriate categorical emotion labels. We train our method using a combination of gestures annotated with known emotion terms and gestures not annotated with any emotions. We evaluate our method on the MPI Emotional Body Expressions Database (EBEDB) and obtain an accuracy of $58.43\%$. This improves the performance of current state-of-the-art algorithms for generalized zero-shot learning by $25$--$27\%$ in absolute terms.
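The last step the abstract describes, mapping a gesture representation to a categorical emotion label through the word-embedding space, can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes the gesture has already been encoded into the same space as the label embeddings, and it assumes a nearest-neighbour rule under cosine similarity, which the abstract does not specify. The function names and the toy 3-D vectors are hypothetical stand-ins for a trained encoder's output and real word2vec vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_emotion(gesture_embedding, label_embeddings):
    """Return the emotion term whose embedding is most similar to the gesture embedding."""
    return max(
        label_embeddings,
        key=lambda term: cosine(gesture_embedding, label_embeddings[term]),
    )


# Toy vectors standing in for word2vec embeddings of emotion terms (assumption).
label_embeddings = {
    "joy": [0.9, 0.1, 0.0],       # seen during training
    "sadness": [-0.8, 0.2, 0.1],  # seen during training
    "relief": [0.6, 0.5, 0.2],    # unseen: no annotated gestures in training
}

# Hypothetical output of a gesture encoder, projected into the label space.
gesture_embedding = [0.5, 0.6, 0.1]

predicted = nearest_emotion(gesture_embedding, label_embeddings)
print(predicted)
```

Because the candidate set holds both seen and unseen terms, the lookup reflects the generalized zero-shot setting: a gesture can be assigned a label that had no annotated examples in training, provided its embedding lands closest to that term's vector.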

