CV | Jun 14, 2023 | Code
MMASD: A Multimodal Dataset for Autism Intervention Analysis
Jicheng Li, Vuthea Chheang, Pinar Kullu et al.
Autism spectrum disorder (ASD) is a developmental disorder characterized by significant social communication impairments and difficulties perceiving and presenting communication cues. Machine learning techniques have been broadly adopted to facilitate autism studies and assessments. However, computational models in the autism community are primarily concentrated on specific analyses and validated on private datasets, which limits comparisons across models due to privacy-preserving data sharing complications. This work presents a novel privacy-preserving open-source dataset, MMASD, a MultiModal ASD benchmark dataset collected from play therapy interventions of children with autism. MMASD includes data from 32 children with ASD, and 1,315 data samples segmented from over 100 hours of intervention recordings. To promote public access, each data sample consists of four privacy-preserving modalities of data, some of which are derived from the original videos: (1) optical flow, (2) 2D skeleton, (3) 3D skeleton, and (4) clinician ASD evaluation scores of children, e.g., ADOS scores. MMASD aims to assist researchers and therapists in understanding children's cognitive status, monitoring their progress during therapy, and customizing the treatment plan accordingly. It can also inspire downstream tasks such as action quality assessment and interpersonal synchrony estimation. The MMASD dataset can be accessed at https://github.com/Li-Jicheng/MMASD-A-Multimodal-Dataset-for-Autism-Intervention-Analysis.
CV | Aug 27, 2024 | Code
MMASD+: A Novel Dataset for Privacy-Preserving Behavior Analysis of Children with Autism Spectrum Disorder
Pavan Uttej Ravva, Behdokht Kiafar, Pinar Kullu et al.
Autism spectrum disorder (ASD) is characterized by significant challenges in social interaction and comprehending communication signals. Recently, therapeutic interventions for ASD have increasingly utilized deep learning-powered computer vision techniques to monitor individual progress over time. These models are trained on private datasets from the autism community, creating challenges in comparing results across different models due to privacy-preserving data-sharing issues. This work introduces MMASD+, an enhanced version of the novel open-source dataset called Multimodal ASD (MMASD). MMASD+ consists of diverse data modalities, including 3D-Skeleton, 3D Body Mesh, and Optical Flow data. It integrates the capabilities of YOLOv8 and Deep SORT algorithms to distinguish between the therapist and children, addressing a significant barrier in the original dataset. Additionally, a Multimodal Transformer framework is proposed to predict 11 action types and the presence of ASD. This framework achieves an accuracy of 95.03% for predicting action types and 96.42% for predicting ASD presence, demonstrating over a 10% improvement compared to models trained on single data modalities. These findings highlight the advantages of integrating multiple data modalities within the Multimodal Transformer framework.
71.8 | HC | Apr 27
AFA: Identity-Aware Memory for Preventing Persona Confusion in Multi-User Dialogue
Mohammad Al-Ratrout, Pavan Uttej Ravva, Shayla Sharmin et al.
When multiple people share a single voice assistant, the system conflates their histories: one resident's preferences can leak into another's responses, eroding utility and trust. We call this failure mode persona confusion, and we show it is a measurable problem in today's single-user dialogue systems when deployed in shared environments. We present the Adaptive Friend Agent (AFA), a modular framework that combines voice-based speaker identification with per-user memory stores to enable identity-aware, personalized dialogue across multiple users. To support training and evaluation, we construct PAT (Personalized Agent chaT), a synthetic dataset of 58,289 persona-grounded dialogue turns spanning 133 user profiles and 12 real-world scenarios. We evaluate AFA across five LLM back-ends in a standard response-quality benchmark, with a LLaMA-2-70B model fine-tuned on PAT achieving the highest overall performance. To directly measure persona confusion prevention, we introduce an interleaved multi-user evaluation protocol with a novel metric, Persona Attribution Accuracy (PAA), demonstrating that identity-aware routing improves PAA from 35.7% to 61.3%. Human evaluation confirms annotators perceive significantly higher personalization in routing-enabled responses. Our results establish that identity-aware user routing is the critical component for preventing persona confusion in multi-user conversational systems.
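The idea of pairing speaker identification with per-user memory stores can be illustrated with a minimal sketch. This is not the AFA implementation: the class `PerUserMemory`, the function `build_prompt`, and the prompt layout are hypothetical names chosen for illustration, and the speaker ID is assumed to come from an upstream voice-based identification step that is not shown here.

```python
from collections import defaultdict


class PerUserMemory:
    """Hypothetical per-user memory: one isolated list of facts per speaker ID."""

    def __init__(self):
        self._stores = defaultdict(list)

    def remember(self, speaker_id, fact):
        # Facts are written only to the store of the identified speaker.
        self._stores[speaker_id].append(fact)

    def recall(self, speaker_id):
        # Return a copy so callers cannot mutate the store; unknown
        # speakers get an empty list and no store is created for them.
        return list(self._stores.get(speaker_id, []))


def build_prompt(memory, speaker_id, utterance):
    """Assemble a prompt that contains only the current speaker's memory.

    speaker_id is assumed to be produced by a separate speaker
    identification component (an assumption of this sketch).
    """
    facts = memory.recall(speaker_id)
    context = "; ".join(facts) if facts else "(no stored preferences)"
    return f"[user={speaker_id}] known: {context}\nUser says: {utterance}"


memory = PerUserMemory()
memory.remember("alice", "is vegetarian")
memory.remember("bob", "likes steak")
print(build_prompt(memory, "alice", "Suggest a dinner recipe."))
```

Because the prompt for one speaker is built only from that speaker's store, another resident's preferences cannot leak into the response context, which is the kind of leakage the abstract calls persona confusion.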
32.5 | HC | Apr 9
Beyond Cognitive Load: AI-Based Estimation of Cognitive Effort Using Brain Signals During Digital Tasks
Shayla Sharmin, Mohammad Fahim Abrar, Gael Lucero-Palacios et al.
Cognitive effort, defined as the relationship between cognitive load and task performance, provides insight into how individuals allocate mental resources during demanding tasks. This construct is particularly important in high-stakes public health and clinical training, where excessive cognitive load is associated with medical errors and burnout. This study investigates whether cognitive effort varies across task segments and whether it can be estimated at the individual level using brain signal data and machine learning. Functional near-infrared spectroscopy (fNIRS) data were collected from 16 participants performing a structured digital cognitive task consisting of four sequential segments separated by short and long rest intervals. Cognitive effort was operationalized using relative neural efficiency and relative neural involvement, integrating prefrontal hemodynamic activity with task performance. The analysis followed a two-stage approach. First, segment-level group analysis tested whether cognitive effort differed across task segments, assessing whether the task structure induced meaningful variation in cognitive demand. Second, participant-independent machine learning models were used to predict task performance from brain signal features. These predicted scores were then combined with neural measures to estimate individual-level cognitive effort. Results showed significant differences in cognitive effort across the four task segments, indicating that variations in task structure influence collective cognitive efficiency. In addition, machine learning models successfully predicted performance from fNIRS data. Cognitive effort derived from predicted scores closely matched that based on actual performance, suggesting that the proposed metric primarily reflects brain signal patterns.
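The abstract does not give formulas for relative neural efficiency and relative neural involvement. A minimal sketch follows, assuming the formulation commonly used in the fNIRS literature, where performance and neural activity are z-scored across observations and combined as (z_P - z_N)/sqrt(2) for efficiency and (z_P + z_N)/sqrt(2) for involvement. The function names and the choice of the population standard deviation are assumptions of this sketch, not details confirmed by the source.

```python
from math import sqrt
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: no variation to standardize.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def neural_efficiency_and_involvement(performance, neural_activity):
    """Return (efficiency, involvement) lists, one entry per observation.

    Assumed formulation (not taken from the abstract):
      efficiency  = (z_performance - z_neural) / sqrt(2)
      involvement = (z_performance + z_neural) / sqrt(2)
    """
    if len(performance) != len(neural_activity):
        raise ValueError("inputs must have the same length")
    zp = zscores(performance)
    zn = zscores(neural_activity)
    efficiency = [(p - n) / sqrt(2) for p, n in zip(zp, zn)]
    involvement = [(p + n) / sqrt(2) for p, n in zip(zp, zn)]
    return efficiency, involvement


# Toy numbers for illustration only; not data from the study.
scores = [1.0, 2.0, 3.0]
hbo = [3.0, 2.0, 1.0]
eff, inv = neural_efficiency_and_involvement(scores, hbo)
print(eff, inv)
```

Under this assumed formulation, high performance with low neural activity yields positive efficiency, and passing predicted scores instead of actual scores as `performance` mirrors the second stage described in the abstract.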
HC | Apr 3, 2025
Hybrid Deep Learning Model to Estimate Cognitive Effort from fNIRS Signals
Shayla Sharmin, Roghayeh Leila Barmaki
This study estimates cognitive effort based on functional near-infrared spectroscopy data and performance scores using a hybrid DeepNet model. The estimation of cognitive effort enables educators to modify material to enhance learning effectiveness and student engagement. In this study, we collected oxygenated hemoglobin using functional near-infrared spectroscopy during an educational quiz game. Participants (n=16) responded to 16 questions in a Unity-based educational game, each within a 30-second response time limit. We used DeepNet models to predict the performance score from the oxygenated hemoglobin, and compared traditional machine learning and DeepNet models to determine which approach provides better accuracy in predicting performance scores. The results show that the proposed CNN-GRU performs best among the compared models, at 73%. After the prediction, we used the predicted score and the oxygenated hemoglobin to observe cognitive effort by calculating relative neural efficiency and involvement in our test cases. Our results show that even with moderate accuracy, the predicted cognitive effort closely follows the actual trends. These findings can be helpful in designing and improving learning environments and provide valuable insights into learning materials.