CV | LG | Sep 19, 2025

Improving Autism Detection with Multimodal Behavioral Analysis

arXiv:2509.21352v1 | 3 citations | h-index: 8 | MICCAI
Originality: Incremental advance
AI Analysis

This work addresses the need for scalable, video-based screening tools to support autism assessment, representing an incremental improvement over prior methods.

The paper tackles autism detection from multimodal behavioral cues in video data. Novel gaze descriptors raise gaze-based classification accuracy from 64% to 69%, and late fusion across all modalities reaches an overall classification accuracy of 74%.

Due to the complex and resource-intensive nature of diagnosing Autism Spectrum Condition (ASC), several computer-aided diagnostic support methods have been proposed to detect autism by analyzing behavioral cues in patient video data. While these models show promising results on some datasets, they struggle with poor gaze feature performance and lack of real-world generalizability. To tackle these challenges, we analyze a standardized video dataset comprising 168 participants with ASC (46% female) and 157 non-autistic participants (46% female), making it, to our knowledge, the largest and most balanced dataset available. We conduct a multimodal analysis of facial expressions, voice prosody, head motion, heart rate variability (HRV), and gaze behavior. To address the limitations of prior gaze models, we introduce novel statistical descriptors that quantify variability in eye gaze angles, improving gaze-based classification accuracy from 64% to 69% and aligning computational findings with clinical research on gaze aversion in ASC. Using late fusion, we achieve a classification accuracy of 74%, demonstrating the effectiveness of integrating behavioral markers across multiple modalities. Our findings highlight the potential for scalable, video-based screening tools to support autism assessment.
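
As a rough illustration of the two technical ideas in the abstract, the sketch below computes a few variability statistics over a gaze-angle series and then combines per-modality predictions by late fusion. It is not the authors' implementation. The abstract does not say which descriptors or fusion rule the paper uses, so the specific statistics (standard deviation, range, mean absolute frame-to-frame change), the probability-averaging rule, the 0.5 threshold, the function names, and all numbers are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def gaze_variability(angles):
    """Hypothetical variability descriptors for one gaze-angle series (degrees).

    These three statistics are illustrative stand-ins, not the paper's descriptors.
    """
    # Absolute change between consecutive frames.
    diffs = [abs(b - a) for a, b in zip(angles, angles[1:])]
    return {
        "std": pstdev(angles),
        "range": max(angles) - min(angles),
        "mean_abs_change": mean(diffs) if diffs else 0.0,
    }


def late_fusion(modality_probs, threshold=0.5):
    """Average per-modality class probabilities, then threshold the mean.

    Averaging is one common late-fusion rule; the paper's rule may differ.
    """
    fused = mean(modality_probs.values())
    return fused, fused >= threshold


# Made-up gaze yaw angles for a short clip.
yaw = [0.0, 2.0, -1.0, 3.0, 0.0]
gaze_features = gaze_variability(yaw)

# Made-up outputs of five independently trained per-modality classifiers.
probs = {"face": 0.62, "voice": 0.55, "head_motion": 0.48, "hrv": 0.51, "gaze": 0.70}
fused_prob, predicted_asc = late_fusion(probs)
```

In a late-fusion setup of this kind, each modality keeps its own classifier and only the final scores are combined, so a weak modality can be reweighted or dropped without retraining the others.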

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
