CL LG SD AS | May 19, 2025

Suicide Risk Assessment Using Multimodal Speech Features: A Study on the SW1 Challenge Dataset

Ambre Marie, Ilias Maoudj, Guillaume Dardenne, Gwenolé Quellec

arXiv:2505.13069v1 | 2.7 | h-index: 39

Originality: Synthesis-oriented

AI Analysis

This work addresses suicide risk assessment in adolescents, but its contribution is incremental: it applies existing methods to a new dataset and yields limited improvements in generalization.

The study tackles suicide risk assessment in adolescents using multimodal speech features. A weighted attention fusion method reached 69% accuracy on the development set, but the authors note a generalization gap between the development and test sets.

The 1st SpeechWellness Challenge conveys the need for speech-based suicide risk assessment in adolescents. This study investigates a multimodal approach for this challenge, integrating automatic transcription with WhisperX, linguistic embeddings from Chinese RoBERTa, and audio embeddings from WavLM. Additionally, handcrafted acoustic features -- including MFCCs, spectral contrast, and pitch-related statistics -- were incorporated. We explored three fusion strategies: early concatenation, modality-specific processing, and weighted attention with mixup regularization. Results show that weighted attention provided the best generalization, achieving 69% accuracy on the development set, though a performance gap between development and test sets highlights generalization challenges. Our findings, strictly tied to the MINI-KID framework, emphasize the importance of refining embedding representations and fusion mechanisms to enhance classification reliability.
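The fusion and regularization ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how attention weights are computed, so here one relevance score per modality is passed in by hand, and all modality vectors are assumed to have already been projected to a common dimension. The function names (`early_concat`, `weighted_attention_fusion`, `mixup`) and the default `alpha=0.2` are hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def early_concat(embeddings):
    """Early fusion: join the per-modality vectors end to end."""
    return [value for vec in embeddings for value in vec]


def weighted_attention_fusion(embeddings, scores):
    """Fuse same-sized modality vectors as a softmax-weighted sum.

    `scores` holds one relevance logit per modality. In a trained model
    these would be learned; here they are supplied by hand (assumption).
    """
    dim = len(embeddings[0])
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError("all modality vectors must share one dimension")
    weights = softmax(scores)
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, embeddings))
        for i in range(dim)
    ]
    return fused, weights


def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Mixup: blend two examples and their labels.

    The mixing coefficient lam is drawn from Beta(alpha, alpha);
    alpha=0.2 is an illustrative default, not a value from the paper.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = lam * y_a + (1.0 - lam) * y_b
    return x, y, lam
```

With three stand-in vectors for the text, audio, and handcrafted acoustic modalities, `weighted_attention_fusion([text_vec, audio_vec, acoustic_vec], [0.0, 0.0, 0.0])` reduces to a plain average, since equal scores give equal weights. Raising one score shifts the fused vector toward that modality, whereas `early_concat` keeps every feature and leaves the weighting to the downstream classifier.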
