CL · SD · AS · Dec 21, 2023

EmphAssess: a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models

Maureen de Seyssel, Antony D'Avirro, Adina Williams, Emmanuel Dupoux

arXiv:2312.14069v211.131 citations · h-index: 19 · Has Code · EMNLP

Originality: Synthesis-oriented

AI Analysis

The paper gives speech-processing researchers a new benchmark for assessing emphasis transfer. The contribution is incremental in that it centers on evaluation rather than on a novel speech-to-speech model.

The authors address the evaluation of prosodic emphasis transfer in speech-to-speech models by introducing EmphAssess, a benchmark applied to speech resynthesis and speech-to-speech translation. Its evaluation pipeline includes EmphaClass, a model for emphasis classification.

We introduce EmphAssess, a prosodic benchmark designed to evaluate the capability of speech-to-speech models to encode and reproduce prosodic emphasis. We apply this to two tasks: speech resynthesis and speech-to-speech translation. In both cases, the benchmark evaluates the ability of the model to encode emphasis in the speech input and accurately reproduce it in the output, potentially across a change of speaker and language. As part of the evaluation pipeline, we introduce EmphaClass, a new model that classifies emphasis at the frame or word level.
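The evaluation idea described in the abstract (classify emphasis per frame or per word, then check whether words emphasized in the input are also emphasized in the output) can be illustrated with a minimal sketch. This is not the authors' pipeline or the EmphaClass model: the function names `frames_to_words` and `emphasis_f1`, the mean-pooling rule, the 0.5 threshold, and the precision/recall/F1 scoring are all assumptions made for illustration. The sketch also assumes that input and output words are already aligned one-to-one, which in the translation setting would need a separate word-alignment step.

```python
from typing import List, Tuple


def frames_to_words(
    frame_probs: List[float],
    word_spans: List[Tuple[int, int]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[bool]:
    """Pool frame-level emphasis probabilities into word-level labels.

    Each span is (start_frame, end_frame) with an exclusive end. A word is
    marked emphasized when the mean probability of its frames reaches the
    threshold (an illustrative pooling rule).
    """
    labels = []
    for start, end in word_spans:
        span = frame_probs[start:end]
        mean = sum(span) / len(span) if span else 0.0
        labels.append(mean >= threshold)
    return labels


def emphasis_f1(
    source: List[bool], output: List[bool]
) -> Tuple[float, float, float]:
    """Score emphasis transfer over aligned word-level labels.

    Returns (precision, recall, f1), treating the source labels as the
    reference and the output labels as the prediction.
    """
    if len(source) != len(output):
        raise ValueError("source and output must be aligned word-for-word")
    tp = sum(1 for s, o in zip(source, output) if s and o)
    fp = sum(1 for s, o in zip(source, output) if not s and o)
    fn = sum(1 for s, o in zip(source, output) if s and not o)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical frame probabilities for a three-word output utterance.
    probs = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1]
    spans = [(0, 2), (2, 5), (5, 6)]
    output_labels = frames_to_words(probs, spans)
    source_labels = [False, True, False]  # emphasis on the second word
    print(output_labels, emphasis_f1(source_labels, output_labels))
```

Scoring at the word level keeps the comparison independent of speaker and speaking rate, which matters when the output voice or language differs from the input.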

