HCNov 7, 2025
Enhancing Public Speaking Skills in Engineering Students Through AIAmol Harsh, Brainerd Prince, Siddharth Siddharth et al.
This research-to-practice full paper was inspired by the persistent challenge in effective communication among engineering students. Public speaking is a necessary skill for future engineers as they have to communicate technical knowledge with diverse stakeholders. While universities offer courses or workshops, they are unable to offer sustained and personalized training to students. Providing comprehensive feedback on both verbal and non-verbal aspects of public speaking is time-intensive, making consistent and individualized assessment impractical. This study integrates research on verbal and non-verbal cues in public speaking to develop an AI-driven assessment model for engineering students. Our approach combines speech analysis, computer vision, and sentiment detection into a multi-modal AI system that provides assessment and feedback. The model evaluates (1) verbal communication (pitch, loudness, pacing, intonation), (2) non-verbal communication (facial expressions, gestures, posture), and (3) expressive coherence, a novel integration ensuring alignment between speech and body language. Unlike previous systems that assess these aspects separately, our model fuses multiple modalities to deliver personalized, scalable feedback. Preliminary testing demonstrated that our AI-generated feedback was moderately aligned with expert evaluations. Among the state-of-the-art AI models evaluated, all of which were Large Language Models (LLMs), including Gemini and OpenAI models, Gemini Pro emerged as the best-performing, showing the strongest agreement with human annotators. By eliminating reliance on human evaluators, this AI-driven public speaking trainer enables repeated practice, helping students naturally align their speech with body language and emotion, crucial for impactful and professional communication.
HCMay 10
AwareLLM: A Proactive Multimodal Ecosystem for Personalized Human-AI Collaboration to Enhance ProductivityAmog Rao, Utkarsh Agarwal, Amol Harsh et al.
Information workers' productivity is significantly influenced by their cognitive states and physiological responses. AI assistants such as ChatGPT, Copilot, and others have become integral components of knowledge-intensive workplaces. These AI assistants utilize pre-defined user preferences and chat interaction histories, thus confining themselves to reactive exchanges, lacking sufficient adaptability. Consequently, they fail to cater to individual user preferences and are unable to adapt to their psychophysiological states, diminishing potential productivity gains. To bridge this gap, we introduce AwareLLM, a novel multimodal framework that integrates egocentric vision, pupillometry, eye-gaze tracking, posture detection, heart activity, and the inferencing capabilities of large language models (LLMs) to create a proactive and context-aware ecosystem. AwareLLM dynamically adapts to users' psychophysiological states while analyzing temporal patterns and behavioral tendencies to provide personalized and timely interventions. We evaluated AwareLLM through a user study with 20 participants, comparing it to a standard LLM assistant across multiple tasks. Our results show statistically significant improvements in task performance, along with reductions in cognitive fatigue and mental demand. Participants described AwareLLM's personalized interventions as timely and relevant, helping them boost their confidence and deepen engagement with their work. AwareLLM opens new avenues for Human-AI collaboration where technology adapts to our needs rather than us adhering to technological constraints.
RONov 29, 2024
SANGO: Socially Aware Navigation through Grouped ObstaclesRahath Malladi, Amol Harsh, Arshia Sangwan et al.
This paper introduces SANGO (Socially Aware Navigation through Grouped Obstacles), a novel method that ensures socially appropriate behavior by dynamically grouping obstacles and adhering to social norms. Using deep reinforcement learning, SANGO trains agents to navigate complex environments leveraging the DBSCAN algorithm for obstacle clustering and Proximal Policy Optimization (PPO) for path planning. The proposed approach improves safety and social compliance by maintaining appropriate distances and reducing collision rates. Extensive experiments conducted in custom simulation environments demonstrate SANGO's superior performance in significantly reducing discomfort (by up to 83.5%), reducing collision rates (by up to 29.4%) and achieving higher successful navigation in dynamic and crowded scenarios. These findings highlight the potential of SANGO for real-world applications, paving the way for advanced socially adept robotic navigation systems.