SD LG AS
Jul 14, 2022

Proceedings of the ICML 2022 Expressive Vocalizations Workshop and Competition: Recognizing, Generating, and Personalizing Vocal Bursts

Alice Baird, Panagiotis Tzirakis, Gauthier Gidel, Marco Jiralerspong, Eilif B. Muller, Kory Mathewson, Björn Schuller, Erik Cambria, Dacher Keltner, Alan Cowen

arXiv:2207.06958v2

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of understanding and synthesizing non-verbal vocalizations for applications in emotional communication, but it is incremental as it builds on existing competition frameworks and datasets.

The paper presents the ICML 2022 Expressive Vocalizations (ExVo) Competition, which tackled the problem of recognizing, generating, and personalizing vocal bursts (e.g., laughs, gasps) for emotional expression, using a dataset of 59,201 vocalizations from 1,702 speakers across three competition tracks.

This is the Proceedings of the ICML Expressive Vocalizations (ExVo) Competition. The ExVo competition focuses on understanding and generating vocal bursts: laughs, gasps, cries, and other non-verbal vocalizations that are central to emotional expression and communication. ExVo 2022 included three competition tracks using a large-scale dataset of 59,201 vocalizations from 1,702 speakers. The first, ExVo-MultiTask, requires participants to train a multi-task model to recognize expressed emotions and demographic traits from vocal bursts. The second, ExVo-Generate, requires participants to train a generative model that produces vocal bursts conveying ten different emotions. The third, ExVo-FewShot, requires participants to leverage few-shot learning incorporating speaker identity to train a model for the recognition of ten emotions conveyed by vocal bursts.
