SDFeb 20, 2023
Towards Measuring and Scoring Speaker Diarization FairnessYannis Tevissen, Jérôme Boudy, Gérard Chollet et al.
Speaker diarization, or the task of finding "who spoke and when", is now used in almost every speech processing application. Nevertheless, its fairness has not yet been evaluated because there was no protocol to study its biases one by one. In this paper we propose a protocol and a scoring method designed to evaluate speaker diarization fairness. This protocol is applied on a large dataset of spoken utterances and report the performances of speaker diarization depending on the gender, the age, the accent of the speaker and the length of the spoken sentence. Some biases induced by the gender, or the accent of the speaker were identified when we applied a state-of-the-art speaker diarization method.
HCApr 28, 2021
The EMPATHIC Project: Building an Expressive, Advanced Virtual Coach to Improve Independent Healthy-Life-Years of the ElderlyLuisa Brinkschulte, Natascha Mariacher, Stephan Schlögl et al.
This paper outlines the EMPATHIC Research & Innovation project, which aims to research, innovate, explore and validate new interaction paradigms and plat-forms for future generations of Personalized Virtual Coaches to assist elderly people living independently at and around their home. Innovative multimodal face analytics, adaptive spoken dialogue systems, and natural language inter-faces are part of what the project investigates and innovates, aiming to help dependent aging persons and their carers. It will uses remote, non-intrusive technologies to extract physiological markers of emotional states and adapt respective coach responses. In doing so, it aims to develop causal models for emotionally believable coach-user interactions, which shall engage elders and thus keep off loneliness, sustain health, enhance quality of life, and simplify access to future telecare services. Through measurable end-user validations performed in Spain, Norway and France (and complementary user evaluations in Italy), the proposed methods and solutions will have to demonstrate useful-ness, reliability, flexibility and robustness.
CVAug 29, 2019
3D Face Pose and Animation Tracking via Eigen-Decomposition based Bayesian ApproachNgoc-Trung Tran, Fakhr-Eddine Ababsa, Maurice Charbit et al.
This paper presents a new method to track both the face pose and the face animation with a monocular camera. The approach is based on the 3D face model CANDIDE and on the SIFT (Scale Invariant Feature Transform) descriptors, extracted around a few given landmarks (26 selected vertices of CANDIDE model) with a Bayesian approach. The training phase is performed on a synthetic database generated from the first video frame. At each current frame, the face pose and animation parameters are estimated via a Bayesian approach, with a Gaussian prior and a Gaussian likelihood function whose the mean and the covariance matrix eigenvalues are updated from the previous frame using eigen decomposition. Numerical results on pose estimation and landmark locations are reported using the Boston University Face Tracking (BUFT) database and Talking Face video. They show that our approach, compared to six other published algorithms, provides a very good compromise and presents a promising perspective due to the good results in terms of landmark localization.
SDNov 2, 2016
The Intelligent Voice 2016 Speaker Recognition SystemAbbas Khosravani, Cornelius Glackin, Nazim Dugan et al.
This paper presents the Intelligent Voice (IV) system submitted to the NIST 2016 Speaker Recognition Evaluation (SRE). The primary emphasis of SRE this year was on developing speaker recognition technology which is robust for novel languages that are much more heterogeneous than those used in the current state-of-the-art, using significantly less training data, that does not contain meta-data from those languages. The system is based on the state-of-the-art i-vector/PLDA which is developed on the fixed training condition, and the results are reported on the protocol defined on the development set of the challenge.