Dynamic Stress Detection: A Study of Temporal Progression Modelling of Stress in Speech
This work addresses stress detection in high-pressure settings by introducing dynamic modelling, though its improvements over existing methods are incremental.
The paper tackles the detection of psychological stress from speech by modelling stress as a temporally evolving phenomenon, and reports accuracy gains of +5% on MuSE and +18% on StressID over existing baselines.
Detecting psychological stress from speech is critical in high-pressure settings. While prior work has leveraged acoustic features for stress detection, most treat stress as a static label. In this work, we model stress as a temporally evolving phenomenon influenced by historical emotional state. We propose a dynamic labelling strategy that derives fine-grained stress annotations from emotional labels and introduce cross-attention-based sequential models, a Unidirectional LSTM and a Transformer Encoder, to capture temporal stress progression. Our approach achieves notable accuracy gains on MuSE (+5%) and StressID (+18%) over existing baselines, and generalises well to a custom real-world dataset. These results highlight the value of modelling stress as a dynamic construct in speech.
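To make the idea of dynamic labelling concrete, the following is a minimal sketch of how per-segment stress labels might be derived from emotional labels while taking history into account. It is not the authors' labelling strategy, which the abstract does not specify. The arousal/valence inputs, the mapping in `instantaneous_stress`, the exponential smoothing, and the `decay` and `threshold` parameters are all illustrative assumptions.

```python
from typing import List, Tuple


def instantaneous_stress(arousal: float, valence: float) -> float:
    """Hypothetical mapping from an emotion label to a stress score.

    Assumes arousal and valence are both scaled to [0, 1]; high arousal
    combined with low valence is treated as stress-like.
    """
    return arousal * (1.0 - valence)


def dynamic_stress_labels(
    emotions: List[Tuple[float, float]],
    decay: float = 0.7,       # assumed weight given to the previous state
    threshold: float = 0.3,   # assumed cut-off for a "stressed" label
) -> List[int]:
    """Assign a binary stress label to each segment of an utterance sequence.

    Each label depends on a running state that blends the current
    emotion-derived score with the state carried over from earlier
    segments (simple exponential smoothing).
    """
    labels: List[int] = []
    state = 0.0
    for i, (arousal, valence) in enumerate(emotions):
        current = instantaneous_stress(arousal, valence)
        # The first segment has no history, so it starts from its own score.
        state = current if i == 0 else decay * state + (1.0 - decay) * current
        labels.append(1 if state >= threshold else 0)
    return labels


# Example sequence of (arousal, valence) pairs: calm, tense, tense, calm.
segments = [(0.2, 0.8), (0.9, 0.1), (0.9, 0.1), (0.2, 0.8)]
labels = dynamic_stress_labels(segments)
```

Under these assumptions, the label for the final calm segment still reflects the preceding tense segments, whereas a static per-segment threshold would reset immediately. This carry-over is the kind of temporal dependence that sequential models such as an LSTM or a Transformer Encoder are intended to capture.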