Shruti Kumar

h-index: 1
2 papers

2 Papers

CV · Aug 26, 2023
Gaze-Informed Vision Transformers: Predicting Driving Decisions Under Uncertainty

Sharath Koorathota, Nikolas Papadopoulos, Jia Li Ma et al.

Vision Transformers (ViT) have advanced computer vision, yet their efficacy in complex tasks like driving remains less explored. This study enhances ViT by integrating human eye gaze, captured via eye-tracking, to increase prediction accuracy for driving decisions under uncertainty in both real-world and virtual reality settings. First, we establish the significance of human eye gaze in left-right driving decisions, as observed in both human subjects and a ViT model. By comparing the similarity between human fixation maps and ViT attention weights, we reveal the dynamics of overlap across individual heads and layers. This overlap demonstrates that fixation data can guide the model in distributing its attention weights more effectively. We introduce the fixation-attention intersection (FAX) loss, a novel loss function that significantly improves ViT performance under high uncertainty conditions. Our results show that ViT, when trained with FAX loss, aligns its attention with human gaze patterns. This gaze-informed approach has significant potential for driver behavior analysis, as well as broader applications in human-centered AI systems, extending ViT's use to complex visual environments.
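
The abstract does not give the FAX loss formula, so the following is only an illustrative sketch of one way an overlap between a human fixation map and a model's attention weights could be turned into a penalty. It assumes both maps are flattened to one non-negative value per image patch, normalizes each to sum to 1, and uses one minus the histogram intersection as the loss. The names `normalize` and `intersection_loss` are hypothetical and are not taken from the paper.

```python
def normalize(weights):
    """Scale non-negative per-patch weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def intersection_loss(fixation, attention):
    """Return 1 minus the histogram intersection of two per-patch maps.

    Assumption: this is a generic overlap penalty chosen for illustration,
    not the FAX loss as defined by the paper's authors.
    """
    if len(fixation) != len(attention):
        raise ValueError("maps must cover the same number of patches")
    f = normalize(fixation)
    a = normalize(attention)
    # Overlap is the mass shared by both distributions, patch by patch.
    overlap = sum(min(fi, ai) for fi, ai in zip(f, a))
    return 1.0 - overlap


if __name__ == "__main__":
    gaze = [1.0, 1.0, 0.0, 0.0]       # toy fixation map over 4 patches
    attn = [0.0, 1.0, 1.0, 0.0]       # toy attention weights over 4 patches
    print(intersection_loss(gaze, attn))
```

Under these assumptions the penalty is 0 when attention matches the fixation distribution exactly and 1 when the two share no patches. In training, a term like this would presumably be added to the task loss with a weighting factor, which is likewise an assumption here.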

AI · May 30, 2025
Mapping Human-Agent Co-Learning and Co-Adaptation: A Scoping Review

Shruti Kumar, Xiaoyu Chen, Xiaomei Wang

Several papers have delved into the challenges of human-AI-robot co-learning and co-adaptation. It has been noted that the terminology used to describe this collaborative relationship in existing studies needs to be more consistent. For example, the prefix "co" is used interchangeably to represent both "collaborative" and "mutual," and the terms "co-learning" and "co-adaptation" are sometimes used interchangeably. However, they can reflect subtle differences in the focus of the studies. The current scoping review's primary research question (RQ1) aims to gather existing papers discussing this collaboration pattern and examine the terms researchers use to describe this human-agent relationship. Given the relative newness of this area of study, we are also keen on exploring the specific types of intelligent agents and task domains that have been considered in existing research (RQ2). This exploration is significant as it can shed light on the diversity of human-agent interactions, from one-time to continuous learning/adaptation scenarios. It can also help us understand the dynamics of human-agent interactions in different task domains, guiding our expectations towards research situated in dynamic, complex domains. Our third objective (RQ3) is to investigate the cognitive theories and frameworks that have been utilized in existing studies to measure human-agent co-learning and co-adaptation. This investigation is crucial as it can help us understand the theoretical underpinnings of human-agent collaboration and adaptation, and it can also guide us in identifying any new frameworks proposed specifically for this type of relationship.