Autism Detection in Speech -- A Survey
This is an incremental survey paper that synthesizes existing research on autism detection through speech analysis for researchers and clinicians.
This survey analyzes studies from the biomedical, psychological, and NLP domains to identify linguistic, prosodic, and acoustic cues for autism detection in speech, concluding that female patients are under-researched and that research combining audio and transcript features is lacking.
There has been a range of studies of how autism manifests in voice, speech, and language. We analyse studies from the biomedical, psychological, and NLP domains in order to find linguistic, prosodic, and acoustic cues that could indicate autism. We define autism and discuss which comorbidities might influence the correct detection of the disorder. We look in particular at observations such as verbal and semantic fluency and prosodic features, as well as disfluencies and speaking rate. We also present word-based approaches and describe machine learning and transformer-based approaches applied to both the audio data and the transcripts. Lastly, we conclude that, while there is already a lot of research, female patients seem to be severely under-researched. In addition, most NLP research focuses on traditional machine learning methods rather than transformers, which could be beneficial in this context. Finally, we were unable to find research combining features from both audio and transcripts.