HC · AS · Mar 17

Collecting Prosody in the Wild: A Content-Controlled, Privacy-First Smartphone Protocol and Empirical Evaluation

Timo K. Koch, Florian Bemmann, Ramona Schoedel, Markus Buehner, Clemens Stachl

arXiv:2603.17061 · 2.5 · h-index: 20

Predicted impact: top 89% in HC · last 90 days
Originality: Synthesis-oriented

AI Analysis

This work addresses three problems researchers face when collecting speech data: the confounding of prosody and semantics, privacy constraints, and participant compliance. The contribution is incremental, packaging existing methods into a new protocol.

The researchers tackled the challenge of collecting everyday speech data for prosodic analysis by developing a smartphone protocol that standardizes lexical content and prioritizes privacy, deploying it in a large study with 560 participants and 9,877 recordings to evaluate compliance and data quality.

Collecting everyday speech data for prosodic analysis is challenging due to the confounding of prosody and semantics, privacy constraints, and participant compliance. We introduce and empirically evaluate a content-controlled, privacy-first smartphone protocol that uses scripted read-aloud sentences to standardize lexical content (including prompt valence) while capturing natural variation in prosodic delivery. The protocol performs on-device prosodic feature extraction, deletes raw audio immediately, and transmits only derived features for analysis. We deployed the protocol in a large study (N = 560; 9,877 recordings), evaluated compliance and data quality, and conducted diagnostic prediction tasks on the extracted features, predicting speaker sex and concurrently reported momentary affective states (valence, arousal). We discuss implications and directions for advancing and deploying the protocol.
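The extract-then-delete flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`write_test_tone`, `extract_and_discard`) are hypothetical, the three features (duration, RMS energy, zero-crossing rate) are simple stand-ins for whatever prosodic feature set the protocol actually extracts, and a synthetic sine tone replaces a real read-aloud recording. The sketch assumes mono 16-bit WAV input and uses only the Python standard library.

```python
import math
import os
import struct
import tempfile
import wave


def write_test_tone(path, freq_hz=220.0, seconds=0.5, rate=16000):
    """Write a mono 16-bit sine tone; a stand-in for a read-aloud recording."""
    n = int(seconds * rate)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)


def extract_and_discard(path):
    """Compute placeholder features locally, then delete the raw audio.

    Assumption: the features below are illustrative only, not the
    feature set used in the paper.
    """
    try:
        with wave.open(path, "rb") as w:
            rate = w.getframerate()
            n = w.getnframes()
            raw = w.readframes(n)
        if n == 0:
            raise ValueError("empty recording")
        samples = struct.unpack("<%dh" % n, raw)
        duration = n / rate
        rms = math.sqrt(sum(s * s for s in samples) / n) / 32768.0
        crossings = sum(
            1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
        )
        features = {
            "duration_s": duration,
            "rms_energy": rms,
            "zero_crossing_rate_hz": crossings / duration,
        }
    finally:
        # The raw audio is removed whether or not extraction succeeded.
        os.remove(path)
    return features


if __name__ == "__main__":
    wav_path = os.path.join(tempfile.mkdtemp(), "recording.wav")
    write_test_tone(wav_path)
    payload = extract_and_discard(wav_path)
    # Only this derived-feature dictionary would be transmitted.
    print(payload)
```

Placing the deletion in a `finally` clause means the raw file is removed even when feature extraction fails, so a crash cannot leave audio behind on the device.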

