SD AI AS

Sep 14, 2024

Egocentric Speaker Classification in Child-Adult Dyadic Interactions: From Sensing to Computational Modeling

Tiantian Feng, Anfeng Xu, Xuan Shi, Somer Bishop, Shrikanth Narayanan

arXiv:2409.09340v2

4.92 citations

h-index: 18

Originality: Synthesis-oriented

AI Analysis

This work addresses speaker classification in ASD behavioral assessments. Its approach could enhance treatment monitoring, though it appears incremental: it builds on existing methods, adding new data and pre-training.

The study tackles automatic speaker identification in child-adult dyadic interactions for autism spectrum disorder (ASD) treatment evaluation. It collects egocentric speech data with wearable sensors and pre-trains on Ego4D speech samples, and it reports improved speaker classification accuracy.

Autism spectrum disorder (ASD) is a neurodevelopmental condition characterized by challenges in social communication, repetitive behavior, and sensory processing. One important research area in ASD is evaluating children's behavioral changes over time during treatment. The standard protocol for this objective is BOSCC, which involves dyadic interactions between a child and clinicians performing a pre-defined set of activities. A fundamental aspect of understanding children's behavior in these interactions is automatic speech understanding, particularly identifying who speaks and when. Conventional approaches in this area rely heavily on speech samples recorded from a spectator perspective, and there is limited research on egocentric speech modeling. In this study, we design an experiment to perform speech sampling in BOSCC interviews from an egocentric perspective using wearable sensors and explore pre-training on Ego4D speech samples to enhance child-adult speaker classification in dyadic interactions. Our findings highlight the potential of egocentric speech collection and pre-training to improve speaker classification accuracy.
