Pavan Uttej Ravva

CVAug 27, 2024Code

MMASD+: A Novel Dataset for Privacy-Preserving Behavior Analysis of Children with Autism Spectrum Disorder

Pavan Uttej Ravva, Behdokht Kiafar, Pinar Kullu et al.

Autism spectrum disorder (ASD) is characterized by significant challenges in social interaction and comprehending communication signals. Recently, therapeutic interventions for ASD have increasingly utilized Deep learning powered-computer vision techniques to monitor individual progress over time. These models are trained on private, non-public datasets from the autism community, creating challenges in comparing results across different models due to privacy-preserving data-sharing issues. This work introduces MMASD+, an enhanced version of the novel open-source dataset called Multimodal ASD (MMASD). MMASD+ consists of diverse data modalities, including 3D-Skeleton, 3D Body Mesh, and Optical Flow data. It integrates the capabilities of Yolov8 and Deep SORT algorithms to distinguish between the therapist and children, addressing a significant barrier in the original dataset. Additionally, a Multimodal Transformer framework is proposed to predict 11 action types and the presence of ASD. This framework achieves an accuracy of 95.03% for predicting action types and 96.42% for predicting ASD presence, demonstrating over a 10% improvement compared to models trained on single data modalities. These findings highlight the advantages of integrating multiple data modalities within the Multimodal Transformer framework.

45.9HCApr 27

AFA: Identity-Aware Memory for Preventing Persona Confusion in Multi-User Dialogue

Mohammad Al-Ratrout, Pavan Uttej Ravva, Shayla Sharmin et al.

When multiple people share a single voice assistant, the system conflates their histories: one resident's preferences can leak into another's responses, eroding utility and trust. We call this failure mode persona confusion, and we show it is a measurable problem in today's single-user dialogue systems when deployed in shared environments. We present the Adaptive Friend Agent (AFA), a modular framework that combines voice-based speaker identification with per-user memory stores to enable identity-aware, personalized dialogue across multiple users. To support training and evaluation, we construct PAT (Personalized Agent chaT), a synthetic dataset of 58,289 persona-grounded dialogue turns spanning 133 user profiles and 12 real-world scenarios. We evaluate AFA across five LLM back-ends in a standard response-quality benchmark, with a LLaMA-2-70B model fine-tuned on PAT achieving the highest overall performance. To directly measure persona confusion prevention, we introduce an interleaved multi-user evaluation protocol with a novel metric, Persona Attribution Accuracy (PAA), demonstrating that identity-aware routing improves PAA from 35.7% to 61.3%. Human evaluation confirms annotators perceive significantly higher personalization in routing-enabled responses. Our results establish that identity-aware user routing is the critical component for preventing persona confusion in multi-user conversational systems.

Pavan Uttej Ravva

2 Papers