71.9 HC Apr 27
AFA: Identity-Aware Memory for Preventing Persona Confusion in Multi-User Dialogue
Mohammad Al-Ratrout, Pavan Uttej Ravva, Shayla Sharmin et al.
When multiple people share a single voice assistant, the system conflates their histories: one resident's preferences can leak into another's responses, eroding utility and trust. We call this failure mode persona confusion, and we show it is a measurable problem in today's single-user dialogue systems when deployed in shared environments. We present the Adaptive Friend Agent (AFA), a modular framework that combines voice-based speaker identification with per-user memory stores to enable identity-aware, personalized dialogue across multiple users. To support training and evaluation, we construct PAT (Personalized Agent chaT), a synthetic dataset of 58,289 persona-grounded dialogue turns spanning 133 user profiles and 12 real-world scenarios. We evaluate AFA across five LLM back-ends in a standard response-quality benchmark, with a LLaMA-2-70B model fine-tuned on PAT achieving the highest overall performance. To directly measure persona confusion prevention, we introduce an interleaved multi-user evaluation protocol with a novel metric, Persona Attribution Accuracy (PAA), demonstrating that identity-aware routing improves PAA from 35.7% to 61.3%. Human evaluation confirms annotators perceive significantly higher personalization in routing-enabled responses. Our results establish that identity-aware user routing is the critical component for preventing persona confusion in multi-user conversational systems.
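The routing idea described in this abstract can be illustrated with a minimal sketch. It assumes a speaker ID has already been produced by some voice-identification step. The class and method names (`PerUserMemory`, `remember`, `recall`) are hypothetical and are not taken from AFA.

```python
from collections import defaultdict


class PerUserMemory:
    """Hypothetical per-user memory: one isolated list of facts per speaker ID."""

    def __init__(self):
        # Each speaker ID maps to its own list, so facts never mix across users.
        self._stores = defaultdict(list)

    def remember(self, speaker_id, fact):
        """Store a fact under the identified speaker only."""
        self._stores[speaker_id].append(fact)

    def recall(self, speaker_id):
        """Return a copy of the facts for this speaker; unknown speakers get []."""
        return list(self._stores.get(speaker_id, []))


# Two residents share one assistant (speaker IDs are assumed inputs).
memory = PerUserMemory()
memory.remember("alice", "is vegetarian")
memory.remember("bob", "likes steak")

# Routing by speaker ID keeps Bob's preference out of Alice's context.
alice_context = memory.recall("alice")
```

A single shared list would return both facts for either speaker, which is the leakage the abstract calls persona confusion.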
32.0 HC Apr 9
Beyond Cognitive Load: AI-Based Estimation of Cognitive Effort Using Brain Signals During Digital Tasks
Shayla Sharmin, Mohammad Fahim Abrar, Gael Lucero-Palacios et al.
Cognitive effort, defined as the relationship between cognitive load and task performance, provides insight into how individuals allocate mental resources during demanding tasks. This construct is particularly important in high-stakes public health and clinical training, where excessive cognitive load is associated with medical errors and burnout. This study investigates whether cognitive effort varies across task segments and whether it can be estimated at the individual level using brain signal data and machine learning. Functional near-infrared spectroscopy (fNIRS) data were collected from 16 participants performing a structured digital cognitive task consisting of four sequential segments separated by short and long rest intervals. Cognitive effort was operationalized using relative neural efficiency and relative neural involvement, integrating prefrontal hemodynamic activity with task performance. The analysis followed a two-stage approach. First, segment-level group analysis tested whether cognitive effort differed across task segments, assessing whether the task structure induced meaningful variation in cognitive demand. Second, participant-independent machine learning models were used to predict task performance from brain signal features. These predicted scores were then combined with neural measures to estimate individual-level cognitive effort. Results showed significant differences in cognitive effort across the four task segments, indicating that variations in task structure influence collective cognitive efficiency. In addition, machine learning models successfully predicted performance from fNIRS data. Cognitive effort derived from predicted scores closely matched that based on actual performance, suggesting that the proposed metric primarily reflects brain signal patterns.
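The abstract does not state how relative neural efficiency and involvement are computed. The sketch below assumes the formulation commonly used in the fNIRS literature: performance and neural activity are each z-scored, then combined as (z_P - z_N)/sqrt(2) for efficiency and (z_P + z_N)/sqrt(2) for involvement. The function names and sample values are illustrative only, not the authors' code or data.

```python
import math
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are identical")
    return [(v - mu) / sigma for v in values]


def relative_neural_measures(performance, neural_activity):
    """Return (efficiency, involvement) lists under the assumed formulation.

    Efficiency is high when performance is high relative to neural activity;
    involvement is high when both are high.
    """
    z_p = zscores(performance)
    z_n = zscores(neural_activity)
    efficiency = [(p - n) / math.sqrt(2) for p, n in zip(z_p, z_n)]
    involvement = [(p + n) / math.sqrt(2) for p, n in zip(z_p, z_n)]
    return efficiency, involvement


# Illustrative values only: three segments where performance rises
# while prefrontal activation falls.
rne, rni = relative_neural_measures([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In the second stage the abstract describes, the performance input would be the model-predicted score rather than the measured one.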
HC Apr 3, 2025
Hybrid Deep Learning Model to Estimate Cognitive Effort from fNIRS Signals
Shayla Sharmin, Roghayeh Leila Barmaki
This study estimates cognitive effort from functional near-infrared spectroscopy (fNIRS) data and performance scores using a hybrid DeepNet model. Estimating cognitive effort enables educators to modify material to enhance learning effectiveness and student engagement. We recorded oxygenated hemoglobin levels using fNIRS during an educational quiz game. Participants (n=16) responded to 16 questions in a Unity-based educational game, each within a 30-second response time limit. We used DeepNet models to predict the performance score from the oxygenated hemoglobin signal, and compared traditional machine learning and DeepNet models to determine which approach predicts performance scores more accurately. The results show that the proposed CNN-GRU outperforms the other models, reaching 73% accuracy. After prediction, we used the predicted score and the oxygenated hemoglobin signal to observe cognitive effort by calculating relative neural efficiency and involvement in our test cases. Our results show that even with moderate accuracy, the predicted cognitive effort closely follows the actual trends. These findings can help in designing and improving learning environments and provide valuable insights into learning materials.