Jamison Heard

RO
h-index: 14
Papers: 9
Citations: 19
Novelty: 45%
AI Score: 48

9 Papers

53.1 | RO | Apr 4 | Code
Optimizing Neurorobot Policy under Limited Demonstration Data through Preference Regret

Viet Dung Nguyen, Yuhang Song, Anh Nguyen et al.

Robot reinforcement learning from demonstrations (RLfD) assumes that expert data is abundant; this is usually unrealistic in the real world given data scarcity as well as high collection cost. Furthermore, imitation learning algorithms assume that the data is independently and identically distributed, which ultimately results in poorer performance as gradual errors emerge and compound within test-time trajectories. We address these issues by introducing the "master your own expertise" (MYOE) framework, a self-imitation framework that enables robotic agents to learn complex behaviors from limited demonstration data samples. Inspired by human perception and action, we propose and design what we call the queryable mixture-of-preferences state space model (QMoP-SSM), which estimates the desired goal at every time step. These desired goals are used in computing the "preference regret", which is used to optimize the robot control policy. Our experiments demonstrate the robustness, adaptability, and out-of-sample performance of our agent compared to other state-of-the-art RLfD schemes. The GitHub repository that supports this work can be found at: https://github.com/rxng8/neurorobot-preference-regret-learning.

RO | Aug 28, 2023
Human Comfortability Index Estimation in Industrial Human-Robot Collaboration Task

Celal Savur, Jamison Heard, Ferat Sahin

Fluent human-robot collaboration requires a robot teammate to understand, learn, and adapt to the human's psycho-physiological state. Such collaborations require a computing system that monitors human physiological signals during human-robot collaboration (HRC) to quantitatively estimate a human's level of comfort, which we term the comfortability index (CI) and uncomfortability index (unCI). Subjective metrics (surprise, anxiety, boredom, calmness, and comfortability) and physiological signals were collected during a human-robot collaboration experiment that varied robot behavior. The emotion circumplex model is adapted to calculate the CI from the participant's quantitative data as well as physiological data. To estimate CI/unCI from physiological signals, time features were extracted from electrocardiogram (ECG), galvanic skin response (GSR), and pupillometry signals. In this research, we successfully adapt the circumplex model to find the location (axis) of 'comfortability' and 'uncomfortability' on the model, and these locations match the closest emotions on the circumplex model. Finally, the study showed that the proposed approach can estimate human comfortability/uncomfortability from physiological signals.
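The abstract mentions time features extracted from ECG but does not list them. As a minimal sketch, the block below computes three common heart-rate-variability statistics (mean RR, SDNN, RMSSD) from a list of RR intervals. The choice of features, the function name `time_domain_features`, and the sample intervals are assumptions for illustration, not the authors' pipeline.

```python
import math

def time_domain_features(rr_ms):
    """Return mean RR, SDNN, and RMSSD (all in ms) for a list of RR intervals."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of successive RR differences.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {"mean_rr": mean_rr, "sdnn": sdnn, "rmssd": rmssd}

# Hypothetical RR intervals in milliseconds (not data from the study).
features = time_domain_features([800, 810, 790, 805, 795])
print(features)
```

Features like these would typically be computed over a sliding window and fed to a regressor that predicts CI/unCI; the window length and model are left open here.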

23.9 | HC | May 14
Understanding How International Students in the U.S. Are Using Conversational AI to Support Cross-Cultural Adaptation

Laleh Nourian, Anisa Callis, Stephanie Patterson et al.

Moving to a new culture and adapting to a new life as an international student can be a stressful experience. In the US, international students face unique, overlapping challenges, yet the current support ecosystem, including university support systems and informal social networks, remains largely fragmented. While conversational AI has emerged as a tool used by many (e.g., generative AI chatbots like ChatGPT and Google Gemini), we do not have a clear understanding of how international students adopt and perceive these technologies as support tools. We conducted a survey study (n=60) to map the relationship between international students' challenges and AI adoption patterns, followed by an interview study with 14 participants to identify the underlying motivations and boundaries of use. Our findings show that AI is perceived as a first-aid tool for immediate challenges; however, there is an interest in transforming AI from a tool for short-term help into a long-term support companion. By identifying where and how AI can provide long-term support, and where it is insufficient, we contribute recommendations for creating AI-powered support tailored to the unique needs of international students.

16.2 | LG | May 12
Intrinsic Vicarious Conditioning for Deep Reinforcement Learning

Rodney A Sanchez, Ferat Sahin, Alex Ororbia et al.

Advancements in reinforcement learning have produced a variety of complex and useful intrinsic driving forces; crucially, these drivers operate under a direct conditioning paradigm. This form of conditioning limits our agents' capacity by restricting how they learn from the environment as well as from others. Off-policy or learn-by-example methods can learn from demonstrators' representations, but they require access to the demonstrating agent's policies or their reward functions. Our work overcomes this direct sampling limitation by introducing vicarious conditioning as an intrinsic reward mechanism. We draw from psychological and biological literature to provide a foundation for vicarious conditioning and use memory-based methods to implement its four steps: attention, retention, reproduction, and reinforcement. Crucially, our vicarious conditioning paradigms support low-shot learning and do not require the demonstrator agent's policy nor its reward functions. We evaluate our approach in the MiniWorld Sidewalk environment, one of the few public environments that features a non-descriptive terminal condition (no reward provided upon agent death), and extend it to Box2D's CarRacing environment. Our results across both environments demonstrate that vicarious conditioning enables longer episode lengths by discouraging the agent from non-descriptive terminal conditions and guiding the agent toward desirable states. Overall, this work emulates a cognitively-plausible learning paradigm better suited to problems such as single-life learning or continual learning.

CV | May 30, 2025 | Code
Shuffle PatchMix Augmentation with Confidence-Margin Weighted Pseudo-Labels for Enhanced Source-Free Domain Adaptation

Prasanna Reddy Pulakurthi, Majid Rabbani, Jamison Heard et al.

This work investigates Source-Free Domain Adaptation (SFDA), where a model adapts to a target domain without access to source data. A new augmentation technique, Shuffle PatchMix (SPM), and a novel reweighting strategy are introduced to enhance performance. SPM shuffles and blends image patches to generate diverse and challenging augmentations, while the reweighting strategy prioritizes reliable pseudo-labels to mitigate label noise. These techniques are particularly effective on smaller datasets like PACS, where overfitting and pseudo-label noise pose greater risks. State-of-the-art results are achieved on three major benchmarks: PACS, VisDA-C, and DomainNet-126. Notably, on PACS, improvements of 7.3% (79.4% to 86.7%) and 7.2% are observed in single-target and multi-target settings, respectively, while gains of 2.8% and 0.7% are attained on DomainNet-126 and VisDA-C. This combination of advanced augmentation and robust pseudo-label reweighting establishes a new benchmark for SFDA. The code is available at: https://github.com/PrasannaPulakurthi/SPM
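The two ideas in this abstract, shuffling and blending image patches and weighting pseudo-labels by reliability, can be sketched in a few lines. The block below is an illustration under stated assumptions: the linear blend with a fixed `alpha`, the margin defined as the gap between the two largest class probabilities, and all function names are hypothetical. The authors' actual formulation is in the linked repository.

```python
import random

def shuffle_patches(image, patch, rng):
    """Split a square 2D image (list of rows) into patch x patch tiles and permute them."""
    size = len(image)
    assert size % patch == 0, "image size must be divisible by patch size"
    grid = size // patch
    # Collect tiles in row-major order.
    tiles = [
        [row[c * patch:(c + 1) * patch] for row in image[r * patch:(r + 1) * patch]]
        for r in range(grid) for c in range(grid)
    ]
    rng.shuffle(tiles)
    # Reassemble the shuffled tiles into a full image.
    out = []
    for r in range(grid):
        for y in range(patch):
            row = []
            for c in range(grid):
                row.extend(tiles[r * grid + c][y])
            out.append(row)
    return out

def blend(image, shuffled, alpha):
    """Pixel-wise mix: alpha * original + (1 - alpha) * shuffled (assumed blend rule)."""
    return [[alpha * a + (1 - alpha) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(image, shuffled)]

def margin_weight(probs):
    """Weight a pseudo-label by the gap between the top two class probabilities (assumed)."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2

image = [[4 * r + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
shuffled = shuffle_patches(image, patch=2, rng=random.Random(0))
mixed = blend(image, shuffled, alpha=0.5)
weight = margin_weight([0.7, 0.2, 0.1])
```

A confident prediction such as `[0.7, 0.2, 0.1]` receives a larger weight than an ambiguous one such as `[0.4, 0.35, 0.25]`, so noisy pseudo-labels contribute less to the loss.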

CV | Jan 15, 2024
Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning

Manish Sharma, Jamison Heard, Eli Saber et al.

While Convolutional Neural Networks (CNNs) excel at learning complex latent-space representations, their over-parameterization can lead to overfitting and reduced performance, particularly with limited data. This, alongside their high computational and memory demands, limits the applicability of CNNs for edge deployment. Low-rank matrix approximation has emerged as a promising approach to reduce CNN parameters, but its application presents challenges including rank selection and performance loss. To address these issues, we propose an efficient training method for CNN compression via dynamic parameter rank pruning. Our approach integrates efficient matrix factorization and novel regularization techniques, forming a robust framework for dynamic rank reduction and model compression. We use Singular Value Decomposition (SVD) to model low-rank convolutional filters and dense weight matrices and we achieve model compression by training the SVD factors with back-propagation in an end-to-end way. We evaluate our method on an array of modern CNNs, including ResNet-18, ResNet-20, and ResNet-32, and datasets like CIFAR-10, CIFAR-100, and ImageNet (2012), showcasing its applicability in computer vision. Our experiments show that the proposed method can yield substantial storage savings while maintaining or even enhancing classification performance.
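To see why a low-rank SVD factorization saves storage, it helps to count parameters. The sketch below compares a dense m x n weight matrix with its rank-r factors and picks a rank with a simple energy threshold. The threshold rule is a stand-in chosen for illustration; the paper selects ranks dynamically during training with its own regularization. The singular values and function names are hypothetical.

```python
def rank_for_energy(singular_values, keep=0.95):
    """Smallest rank whose squared singular values hold `keep` of the total energy."""
    total = sum(s * s for s in singular_values)
    running = 0.0
    for r, s in enumerate(singular_values, start=1):
        running += s * s
        if running >= keep * total:
            return r
    return len(singular_values)

def svd_param_count(m, n, r):
    """Parameters in U (m x r), the r singular values, and V (n x r)."""
    return r * (m + n + 1)

# Hypothetical singular values of a 512 x 512 layer, sorted in descending order.
sv = [10.0, 5.0, 1.0, 0.5]
r = rank_for_energy(sv, keep=0.95)
dense = 512 * 512
factored = svd_param_count(512, 512, r)
print(r, dense, factored)
```

Factorization only pays off when r * (m + n + 1) < m * n, which is why rank selection is the central difficulty the abstract mentions.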

RO | Jul 8, 2025
Robust Speech-Workload Estimation for Intelligent Human-Robot Systems

Julian Fortune, Julie A. Adams, Jamison Heard

Demanding task environments (e.g., supervising a remotely piloted aircraft) require performing tasks quickly and accurately; however, periods of low and high operator workload can decrease task performance. Intelligent modulation of the system's demands and interaction modality in response to changes in operator workload state may increase performance by avoiding undesirable workload states. This system requires real-time estimation of each workload component (i.e., cognitive, physical, visual, speech, and auditory) to adapt the correct modality. Existing workload systems estimate multiple workload components post hoc, but few estimate speech workload or function in real time. An algorithm to estimate speech workload and mitigate undesirable workload states in real time is presented. An analysis of the algorithm's accuracy is presented, along with results demonstrating the algorithm's generalizability across individuals and human-machine teaming paradigms. Real-time speech workload estimation is a crucial step toward developing adaptive human-machine systems.

AI | Jun 5, 2025
Avoiding Death through Fear Intrinsic Conditioning

Rodney Sanchez, Ferat Sahin, Alexander Ororbia et al.

Biological and psychological concepts have inspired reinforcement learning algorithms to create new complex behaviors that expand agents' capacity. These behaviors can be seen in the rise of techniques like goal decomposition, curriculum, and intrinsic rewards, which have paved the way for these complex behaviors. One limitation in evaluating these methods is the requirement for engineered extrinsic rewards for realistic environments. A central challenge in engineering the necessary reward function(s) comes from these environments containing states that carry high negative rewards, but provide no feedback to the agent. Death is one such stimulus that fails to provide direct feedback to the agent. In this work, we introduce an intrinsic reward function inspired by early amygdala development and produce this intrinsic reward through a novel memory-augmented neural network (MANN) architecture. We show how this intrinsic motivation serves to deter exploration of terminal states and results in avoidance behavior similar to fear conditioning observed in animals. Furthermore, we demonstrate how modifying a threshold where the fear response is active produces a range of behaviors that are described under the paradigm of general anxiety disorders (GADs). We demonstrate this behavior in the Miniworld Sidewalk environment, which provides a partially observable Markov decision process (POMDP) and a sparse reward with a non-descriptive terminal condition, i.e., death. In effect, this study results in a biologically inspired neural architecture and framework for fear conditioning paradigms; we empirically demonstrate avoidance behavior in a constructed agent that is able to solve environments with non-descriptive terminal conditions.
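The role of the fear threshold can be illustrated with a toy memory lookup. This sketch is not the paper's MANN architecture: it assumes states are plain feature vectors, that memory holds vectors observed just before death, and that the intrinsic penalty equals the negative cosine similarity to the closest stored vector once that similarity reaches the threshold. The class `FearMemory` and all values are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class FearMemory:
    """Stores pre-death state vectors and emits a penalty near similar states."""

    def __init__(self, threshold):
        self.threshold = threshold  # similarity at which the fear response activates
        self.slots = []

    def store(self, state):
        self.slots.append(list(state))

    def intrinsic_reward(self, state):
        if not self.slots:
            return 0.0
        sim = max(cosine(state, s) for s in self.slots)
        # Penalize only when the state resembles a remembered pre-death state.
        return -sim if sim >= self.threshold else 0.0

cautious = FearMemory(threshold=0.9)
anxious = FearMemory(threshold=0.5)
for memory in (cautious, anxious):
    memory.store([1.0, 0.0])  # hypothetical state seen just before death

probe = [1.0, 1.0]  # only partly similar to the stored state
print(cautious.intrinsic_reward(probe), anxious.intrinsic_reward(probe))
```

Lowering the threshold makes the penalty fire for states that only loosely resemble danger, which is one simple way to picture the range of behaviors the abstract attributes to the threshold.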

RO | Mar 12, 2020
SAHRTA: A Supervisory-Based Adaptive Human-Robot Teaming Architecture

Jamison Heard, Julian Fortune, Julie A. Adams

Supervisory-based human-robot teams are deployed in various dynamic and extreme environments (e.g., space exploration). Achieving high task performance in such environments is critical, as a mistake may lead to significant monetary loss or human injury. Task performance may be augmented by adapting the supervisory interface's interactions or autonomy levels based on the human supervisor's workload level, as workload is related to task performance. Typical adaptive systems rely solely on the human's overall or cognitive workload state to select what adaptation strategy to implement; however, overall workload encompasses many dimensions (i.e., cognitive, physical, visual, auditory, and speech) called workload components. Selecting an appropriate adaptation strategy based on a complete human workload state (rather than a single workload dimension) may allow for more impactful adaptations that ensure high task performance. A Supervisory-Based Adaptive Human-Robot Teaming Architecture (SAHRTA) that selects an appropriate level of autonomy or system interaction based on a complete real-time multi-dimensional workload estimate and predicted future task performance is introduced. SAHRTA was shown to improve overall task performance in a physically expanded version of the NASA Multi-Attribute Task Battery.