CV · AT · Oct 26, 2021

Emotion recognition in talking-face videos using persistent entropy and neural networks

arXiv:2110.13571v1
Originality: Incremental advance
AI Analysis

This work addresses emotion recognition for applications in human-computer interaction. It is an incremental advance: it builds on existing methods, and its novelty lies in how those tools are combined.

The authors tackle emotion recognition from talking-face videos by combining audio and image data into a topological signature, which a neural network then classifies into eight emotions. They report competitive results that outperform other state-of-the-art methods.

The automatic recognition of a person's emotional state has become a very active research field that involves scientists specialized in different areas such as artificial intelligence, computer vision or psychology, among others. Our main objective in this work is to develop a novel approach, using persistent entropy and neural networks as main tools, to recognise and classify emotions from talking-face videos. Specifically, we combine audio-signal and image-sequence information to compute a topological signature (a 9-dimensional vector) for each video. We prove that small changes in the video produce small changes in the signature. These topological signatures are used to feed a neural network to distinguish between the following emotions: neutral, calm, happy, sad, angry, fearful, disgust, and surprised. The results reached are promising and competitive, beating the performance reached in other state-of-the-art works found in the literature.
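
For readers unfamiliar with persistent entropy, the following is a minimal sketch of the standard definition: the Shannon entropy of the normalised bar lengths of a persistence barcode. It is not the authors' code. The function name `persistent_entropy`, the representation of a barcode as a list of finite `(birth, death)` pairs, and the use of the natural logarithm are assumptions made for illustration. How the paper derives barcodes from audio and image data, and how it assembles the 9-dimensional signature, is not described in the abstract and is not reproduced here.

```python
import math


def persistent_entropy(barcode):
    """Shannon entropy of the normalised bar lengths of a barcode.

    `barcode` is assumed to be an iterable of finite (birth, death) pairs.
    Bars of zero or negative length are ignored.
    """
    lengths = [death - birth for birth, death in barcode if death > birth]
    total = sum(lengths)
    if total == 0:
        # No bars with positive length: define the entropy as zero.
        return 0.0
    # p_i = l_i / L, entropy = -sum(p_i * log(p_i)), natural log assumed.
    return -sum((l / total) * math.log(l / total) for l in lengths)


# Hypothetical barcode with four bars of equal length.
bars = [(0.0, 1.0), (0.0, 1.0), (0.5, 1.5), (0.2, 1.2)]
entropy = persistent_entropy(bars)
print(entropy)
```

With k bars of equal length the value is log(k), the maximum for k bars. A barcode dominated by one long bar gives a value close to zero.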
