CV
May 8, 2025

xTrace: A Facial Expressive Behaviour Analysis Tool for Continuous Affect Recognition

arXiv:2505.05043v2
h-index: 50
Originality: Incremental advance
AI Analysis

This work addresses the problem of real-time facial expressive behavior analysis for affective computing, though it appears incremental as it builds on existing tools and benchmarks.

The paper tackles the challenge of building a robust system for continuous affect recognition from in-the-wild face videos by introducing xTrace, which achieves a mean Concordance Correlation Coefficient (CCC) of 0.86 on an in-the-wild benchmark set of ~50k videos and 0.75 on the SEWA test set, outperforming existing state-of-the-art methods by ~7.1%.

Recognising expressive behaviours in face videos is a long-standing challenge in Affective Computing. Despite significant advancements in recent years, it remains a challenge to build a robust and reliable system for naturalistic and in-the-wild facial expressive behaviour analysis in real time. This paper addresses two key challenges in building such a system: (1) the paucity of large-scale labelled facial affect video datasets with extensive coverage of the 2D emotion space, and (2) the difficulty of extracting facial video features that are discriminative, interpretable, robust, and computationally efficient. Toward addressing these challenges, this work introduces xTrace, a robust tool for facial expressive behaviour analysis and for predicting continuous values of dimensional emotions, namely valence and arousal, from in-the-wild face videos. To address challenge (1), the proposed affect recognition model is trained on the largest facial affect video dataset, containing ~450k videos that cover most emotion zones in the dimensional emotion space, making xTrace highly versatile in analysing a wide spectrum of naturalistic expressive behaviours. To address challenge (2), xTrace uses facial affect descriptors that are not only explainable, but can also achieve a high degree of accuracy and robustness with low computational complexity. The key components of xTrace are benchmarked against three existing tools: MediaPipe, OpenFace, and Augsburg Affect Toolbox. On an in-the-wild benchmarking set composed of ~50k videos, xTrace achieves 0.86 mean Concordance Correlation Coefficient (CCC), and on the SEWA test set it achieves 0.75 mean CCC, outperforming existing SOTA by ~7.1%.
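
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of Lin's Concordance Correlation Coefficient in plain Python. It is not taken from xTrace: the function names `ccc` and `mean_ccc` are hypothetical, and it is only an assumption here that "mean CCC" denotes the average of the valence and arousal scores, since the abstract does not say how the mean is formed or whether scores are computed per video or over the whole set.

```python
from typing import Sequence


def ccc(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Lin's Concordance Correlation Coefficient for two equal-length series.

    CCC = 2 * cov(pred, gold) / (var(pred) + var(gold) + (mean_pred - mean_gold)^2)
    Population (divide-by-n) moments are used throughout.
    """
    if len(pred) != len(gold) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0.0:
        # Both series are the same constant, so the ratio is 0/0.
        raise ValueError("CCC is undefined for two identical constant series")
    return 2.0 * cov / denom


def mean_ccc(
    valence_pred: Sequence[float],
    valence_gold: Sequence[float],
    arousal_pred: Sequence[float],
    arousal_gold: Sequence[float],
) -> float:
    """Assumed aggregation: unweighted average of valence and arousal CCC."""
    return 0.5 * (ccc(valence_pred, valence_gold) + ccc(arousal_pred, arousal_gold))
```

Unlike Pearson correlation, CCC penalises a constant offset or a scale mismatch between predictions and labels: a prediction track that follows the annotation perfectly but sits one unit too high, such as `ccc([1, 2, 3], [2, 3, 4])`, scores 4/7 rather than 1.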

