Charles Brazier

AS
h-index: 14
7 papers
687 citations
Novelty: 45%
AI Score: 28

7 Papers

CL · Aug 6, 2024
Conditioning LLMs with Emotion in Neural Machine Translation

Charles Brazier, Jean-Luc Rouas

Large Language Models (LLMs) have shown remarkable performance in Natural Language Processing tasks, including Machine Translation (MT). In this work, we propose a novel MT pipeline that integrates emotion information extracted from a Speech Emotion Recognition (SER) model into LLMs to enhance translation quality. We first fine-tune five existing LLMs on the Libri-trans dataset and select the most performant model. Subsequently, we augment LLM prompts with different dimensional emotions and train the selected LLM under these different configurations. Our experiments reveal that integrating emotion information, especially arousal, into LLM prompts leads to notable improvements in translation quality.

CL · Apr 27, 2024
Usefulness of Emotional Prosody in Neural Machine Translation

Charles Brazier, Jean-Luc Rouas

Neural Machine Translation (NMT) is the task of translating a text from one language to another using a trained neural network. Several existing works aim at incorporating external information into NMT models to improve or control predicted translations (e.g. sentiment, politeness, gender). In this work, we propose to improve translation quality by adding another external source of information: the automatically recognized emotion in the voice. This work is motivated by the assumption that each emotion is associated with a specific lexicon that can overlap between emotions. Our proposed method follows a two-stage procedure. First, we select a state-of-the-art Speech Emotion Recognition (SER) model to predict dimensional emotion values from all input audio in the dataset. Then, we use these predicted emotions as source tokens added at the beginning of input texts to train our NMT model. We show that integrating emotion information, especially arousal, into NMT systems leads to better translations.
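The second stage, adding predicted emotions as tokens at the start of the source text, might look roughly like the sketch below. The abstract does not say how dimensional values are turned into tokens, so the binning into three levels, the `[0, 1]` value range, the tag format, and the function names `emotion_token` and `tag_source` are all assumptions made for illustration.

```python
def emotion_token(name: str, value: float, n_bins: int = 3) -> str:
    """Map a dimensional emotion value to a discrete tag.

    Assumption: values lie in [0, 1] and are split into equal-width bins.
    The tag format "<name_bin>" is hypothetical, not taken from the paper.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0, 1]")
    # A value of exactly 1.0 falls into the top bin.
    bin_index = min(int(value * n_bins), n_bins - 1)
    return f"<{name}_{bin_index}>"


def tag_source(text: str, emotions: dict[str, float]) -> str:
    """Prepend one emotion tag per dimension to the source sentence."""
    tokens = [emotion_token(name, value) for name, value in emotions.items()]
    return " ".join(tokens + [text])


tagged = tag_source("I can't believe it!", {"arousal": 0.9})
```

With these assumptions, a high-arousal utterance gets a top-bin arousal tag in front of its text, and the NMT model sees that tag as an ordinary source token during training.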

AS · Oct 6, 2021
Improving Real-time Score Following in Opera by Combining Music with Lyrics Tracking

Charles Brazier, Gerhard Widmer

Fully automatic opera tracking is challenging because of the acoustic complexity of the genre, combining musical and linguistic information (singing, speech) in complex ways. In this paper, we propose a new pipeline for complete opera tracking. The pipeline is based on two trackers. A music tracker that has proven to be effective at tracking orchestral parts will lead the tracking process. In addition, a lyrics tracker that has recently been shown to reliably track the lyrics of opera songs will correct the music tracker in parts where the text dominates the music. We will demonstrate the efficiency of this method on the opera Don Giovanni, showing that this technique helps improve the accuracy and robustness of a complete opera tracker.

AS · May 18, 2021
Handling Structural Mismatches in Real-time Opera Tracking

Charles Brazier, Gerhard Widmer

Algorithms for reliable real-time score following in live opera promise many useful applications, such as automatic subtitle display or real-time video cutting in live streaming. Until now, such systems were based on the strong assumption that an opera performance follows the structure of the score linearly. However, this is rarely the case in practice, because of different opera versions and directors' cutting choices. In this paper, we propose a two-level solution to this problem. We introduce a real-time-capable, high-resolution (HR) tracker that can handle jumps or repetitions at specific locations provided to it. We then combine this with an additional low-resolution (LR) tracker that can handle all sorts of mismatches that can occur at any time, with some imprecision, and can re-direct the HR tracker if the latter is 'lost' in the score. We show that the combination of the two improves tracking robustness in the presence of strong structural mismatches.
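One simple way to picture the two-level idea is a rule that falls back to the LR estimate whenever the HR estimate has drifted too far from it. The abstract does not describe how a 'lost' HR tracker is detected, so the distance test, the `tolerance` value, the use of seconds of score time, and the function name `fuse_positions` are assumptions, not the authors' method.

```python
def fuse_positions(hr_pos: float, lr_pos: float, tolerance: float = 10.0) -> float:
    """Choose which score position to report, in seconds of score time.

    Assumption: the HR tracker counts as 'lost' when its estimate is more
    than `tolerance` seconds away from the coarser LR estimate. In that
    case the LR position is returned so the HR tracker can be redirected.
    """
    if abs(hr_pos - lr_pos) > tolerance:
        return lr_pos  # HR is considered lost: redirect to the LR estimate
    return hr_pos  # trackers agree closely enough: keep the precise HR estimate


agreeing = fuse_positions(100.0, 103.0)
after_jump = fuse_positions(100.0, 250.0)
```

Under this rule the precise HR estimate is kept while both trackers agree, and a large disagreement, for example after an unannounced cut, hands control to the LR tracker.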

AS · Oct 21, 2020
Addressing the Recitative Problem in Real-time Opera Tracking

Charles Brazier, Gerhard Widmer

Robust real-time opera tracking (score following) would be extremely useful for many processes surrounding live opera staging and streaming, including automatic lyrics displays, camera control, or live video cutting. Recent work has shown that, with some appropriate measures to account for common problems such as breaks and interruptions, spontaneous applause, various noises and interludes, current audio-to-audio alignment algorithms can be made to follow an entire opera from beginning to end, in a relatively robust way. However, they remain inaccurate when the textual content becomes prominent against the melody or music -- notably, during recitativo passages. In this paper, we address this specific problem by proposing to use two specialized trackers in parallel, one focusing on music-, the other on speech-sensitive features. We first carry out a systematic study on speech-related features, targeting the precise alignment of corresponding recitatives from different performances of the same opera. Then we propose different solutions, based on pre-trained music and speech classifiers, to combine the two trackers in order to improve the global accuracy over the course of the entire opera.

AS · Jun 19, 2020
Towards Reliable Real-time Opera Tracking: Combining Alignment with Audio Event Detectors to Increase Robustness

Charles Brazier, Gerhard Widmer

Recent advances in real-time music score following have made it possible for machines to automatically track highly complex polyphonic music, including full orchestra performances. In this paper, we attempt to take this to an even higher level, namely, live tracking of full operas. We first apply a state-of-the-art audio alignment method based on online Dynamic Time-Warping (OLTW) to full-length recordings of a Mozart opera and, analyzing the tracker's most severe errors, identify three common sources of problems specific to the opera scenario. To address these, we propose a combination of a DTW-based music tracker with specialized audio event detectors (for applause, silence/noise, and speech) that condition the DTW algorithm in a top-down fashion, and show, step by step, how these detectors add robustness to the score follower. However, there remain a number of open problems which we identify as targets for ongoing and future research.
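For readers unfamiliar with the alignment step, the sketch below shows plain offline Dynamic Time Warping on two one-dimensional feature sequences. It is not the online variant (OLTW) used in the paper, which processes the live signal incrementally, and it omits the audio event detectors. The absolute-difference local cost and the function name `dtw_cost` are assumptions chosen for illustration.

```python
def dtw_cost(a: list[float], b: list[float]) -> float:
    """Return the minimum cumulative alignment cost between sequences a and b.

    Plain offline DTW with steps (1,0), (0,1), (1,1) and absolute
    difference as the local cost (an assumption for this sketch).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] holds the best cost of aligning a[:i] with b[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


# A performance that holds the second value twice as long still aligns at zero cost.
cost = dtw_cost([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
```

The example shows why warping suits performance tracking: a tempo change stretches one sequence without raising the alignment cost.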

SD · Jun 17, 2020
Real-time visualisation of fugue played by a string quartet

Olivier Lartillot, Carlos Cancino-Chacón, Charles Brazier

We present a new system for real-time visualisation of music performance, focused for the moment on a fugue played by a string quartet. The basic principle is to offer a visual guide to better understand music using strategies that should be as engaging, accessible and effective as possible. The pitch curves related to the separate voices are drawn on a space whose temporal axis is normalised with respect to metrical positions, and aligned vertically with respect to their thematic and motivic classification. Aspects related to tonality are represented as well. We describe the underlying technologies we have developed and the technical setting. In particular, the rhythmical and structural representation of the piece relies on real-time polyphonic audio-to-score alignment using online dynamic time warping. The visualisation will be presented at a concert of the Danish String Quartet, performing the last piece of The Art of Fugue by Johann Sebastian Bach.