CL | Feb 25, 2025

Uncertainty Modeling in Multimodal Speech Analysis Across the Psychosis Spectrum

arXiv:2502.18285v1 | 5 citations | h-index: 9 | npj Digital Medicine
Originality: Incremental advance
AI Analysis

This work addresses early detection and personalized assessment in psychosis-spectrum research. It is an incremental advance: it applies uncertainty modeling to existing multimodal methods.

The paper predicts symptom severity and psychosis-related traits from speech data using an uncertainty-aware model that integrates acoustic and linguistic features. The authors report improved prediction accuracy, with reduced RMSE and an F1-score of 83%.

Capturing subtle speech disruptions across the psychosis spectrum is challenging because of the inherent variability in speech patterns. This variability reflects individual differences and the fluctuating nature of symptoms in both clinical and non-clinical populations. Accounting for uncertainty in speech data is essential for predicting symptom severity and improving diagnostic precision. Speech disruptions characteristic of psychosis appear across the spectrum, including in non-clinical individuals. We develop an uncertainty-aware model integrating acoustic and linguistic features to predict symptom severity and psychosis-related traits. Quantifying uncertainty in specific modalities allows the model to address speech variability, improving prediction accuracy. We analyzed speech data from 114 participants, including 32 individuals with early psychosis and 82 with low or high schizotypy, collected through structured interviews, semi-structured autobiographical tasks, and narrative-driven interactions in German. The model improved prediction accuracy, reducing RMSE and achieving an F1-score of 83% with ECE = 4.5e-2, showing robust performance across different interaction contexts. Uncertainty estimation improved model interpretability by identifying reliability differences in speech markers such as pitch variability, fluency disruptions, and spectral instability. The model dynamically adjusted to task structures, weighting acoustic features more in structured settings and linguistic features in unstructured contexts. This approach strengthens early detection, personalized assessment, and clinical decision-making in psychosis-spectrum research.
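The abstract reports calibration as ECE = 4.5e-2. The following is a minimal sketch of how expected calibration error is commonly computed, assuming the standard equal-width binned definition; the abstract does not state which variant or how many bins the authors used. The function name, the `n_bins` default, and the toy data are illustrative assumptions, not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |confidence - accuracy| per bin.

    Bins are equal-width intervals [k/n_bins, (k+1)/n_bins); a confidence
    of exactly 1.0 is placed in the last bin. This is an assumed, common
    definition, not necessarily the one used in the paper.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    # Group (confidence, hit) pairs by confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        # Weight each bin's calibration gap by its share of all predictions.
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# Toy data (hypothetical): predicted-class confidences and whether
# each prediction was correct.
toy_conf = [0.9, 0.9, 0.9, 0.9, 0.6, 0.6]
toy_hit = [True, True, True, False, True, False]
toy_ece = expected_calibration_error(toy_conf, toy_hit)
```

Under this definition, an ECE of 4.5e-2 would mean that the model's stated confidence differs from its observed accuracy by about 4.5 percentage points on average, weighted by how many predictions fall in each confidence bin.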

