RO Mar 3, 2025
RECON: Reducing Causal Confusion with Human-Placed Markers
Robert Ramirez Sanchez, Heramb Nemlekar, Shahabedin Sagheb et al.
Imitation learning enables robots to learn new tasks from human examples. One fundamental limitation while learning from humans is causal confusion. Causal confusion occurs when the robot's observations include both task-relevant and extraneous information: for instance, a robot's camera might see not only the intended goal, but also clutter and changes in lighting within its environment. Because the robot does not know which aspects of its observations are important a priori, it often misinterprets the human's examples and fails to learn the desired task. To address this issue, we highlight that -- while the robot learner may not know what to focus on -- the human teacher does. In this paper we propose that the human proactively marks key parts of their task with small, lightweight beacons. Under our framework (RECON) the human attaches these beacons to task-relevant objects before providing demonstrations: as the human shows examples of the task, beacons track the position of marked objects. We then harness this offline beacon data to train a task-relevant state embedding. Specifically, we embed the robot's observations to a latent state that is correlated with the measured beacon readings: in practice, this causes the robot to autonomously filter out extraneous observations and make decisions based on features learned from the beacon data. Our simulations and a real robot experiment suggest that this framework for human-placed beacons mitigates causal confusion. Indeed, we find that using RECON significantly reduces the number of demonstrations needed to convey the task, lowering the overall time required for human teaching. See videos here: https://youtu.be/oy85xJvtLSU
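The idea of keeping only observation features that are correlated with the beacon readings can be illustrated with a small sketch. This is not the RECON algorithm: the abstract describes a learned latent embedding, whereas the code below swaps in a plain Pearson-correlation score per feature as a simplified stand-in. The function names (`pearson`, `beacon_relevance`), the synthetic data, and the 0.5 threshold are all assumptions made for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def beacon_relevance(observations, beacon):
    """Absolute correlation of each observation feature with the beacon reading."""
    n_features = len(observations[0])
    return [
        abs(pearson([obs[j] for obs in observations], beacon))
        for j in range(n_features)
    ]


# Synthetic demonstrations (hypothetical): feature 0 is the marked object's
# position, features 1 and 2 are clutter unrelated to the task.
rng = random.Random(0)
n = 500
goal_x = [rng.gauss(0, 1) for _ in range(n)]
observations = [[g, rng.gauss(0, 1), rng.gauss(0, 1)] for g in goal_x]

# The beacon tracks the marked object with a little measurement noise.
beacon = [g + rng.gauss(0, 0.05) for g in goal_x]

scores = beacon_relevance(observations, beacon)
keep = [j for j, s in enumerate(scores) if s > 0.5]  # threshold is an assumption
print(scores, keep)
```

In this toy setup only the feature tied to the marked object survives the threshold, which mirrors the filtering effect the abstract describes; a learned embedding would play the same role for high-dimensional observations such as images.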
RO Apr 24, 2025
CIVIL: Causal and Intuitive Visual Imitation Learning
Yinlong Dai, Robert Ramirez Sanchez, Ryan Jeronimus et al.
Today's robots attempt to learn new tasks by imitating human examples. These robots watch the human complete the task, and then try to match the actions taken by the human expert. However, this standard approach to visual imitation learning is fundamentally limited: the robot observes what the human does, but not why the human chooses those behaviors. Without understanding which features of the system or environment factor into the human's decisions, robot learners often misinterpret the human's examples. In practice, this results in causal confusion, inefficient learning, and robot policies that fail when the environment changes. We therefore propose a shift in perspective: instead of asking human teachers just to show what actions the robot should take, we also enable humans to intuitively indicate why they made those decisions. Under our paradigm human teachers attach markers to task-relevant objects and use natural language prompts to describe their state representation. Our proposed algorithm, CIVIL, leverages this augmented demonstration data to filter the robot's visual observations and extract a feature representation that aligns with the human teacher. CIVIL then applies these causal features to train a transformer-based policy that -- when tested on the robot -- is able to emulate human behaviors without being confused by visual distractors or irrelevant items. Our simulations and real-world experiments demonstrate that robots trained with CIVIL learn both what actions to take and why to take those actions, resulting in better performance than state-of-the-art baselines. From the human's perspective, our user study reveals that this new training paradigm actually reduces the total time required for the robot to learn the task, and also improves the robot's performance in previously unseen scenarios. See videos at our project website: https://civil2025.github.io
HC Mar 26, 2021
Data-driven sparse skin stimulation can convey social touch information to humans
M. Salvato, Sophia R. Williams, Cara M. Nunez et al.
During social interactions, people use auditory, visual, and haptic cues to convey their thoughts, emotions, and intentions. Due to weight, energy, and other hardware constraints, it is difficult to create devices that completely capture the complexity of human touch. Here we explore whether a sparse representation of human touch is sufficient to convey social touch signals. To test this, we collected a dataset of social touch interactions using a soft wearable pressure sensor array, developed an algorithm to map the recorded data to an array of actuators, and then applied our algorithm to create signals that drive an array of normal indentation actuators placed on the arm. Using this wearable, low-resolution, low-force device, we find that users are able to distinguish the intended social meaning, and we compare their performance to results based on direct human touch. As online communication becomes more prevalent, such systems for conveying haptic signals could allow for improved distant socializing and empathetic remote human-human interaction.
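The step of mapping recorded pressure data to a smaller array of actuators can be sketched as follows. The abstract does not specify the mapping algorithm, so this example substitutes simple block averaging: each actuator is driven by the mean pressure over a contiguous group of sensor columns. The function name `sparse_actuator_commands`, the grid size, and the sample frame are hypothetical.

```python
def sparse_actuator_commands(pressure_frame, n_actuators):
    """Average contiguous groups of sensor columns into one command per actuator.

    pressure_frame: list of rows, each a list of pressure readings.
    n_actuators: number of actuators placed along the column axis.
    """
    n_cols = len(pressure_frame[0])
    if n_cols % n_actuators != 0:
        raise ValueError("sensor columns must divide evenly among actuators")
    width = n_cols // n_actuators
    commands = []
    for a in range(n_actuators):
        # Collect every sensor cell that falls under actuator `a`.
        cells = [
            row[c]
            for row in pressure_frame
            for c in range(a * width, (a + 1) * width)
        ]
        commands.append(sum(cells) / len(cells))
    return commands


# One hypothetical 2 x 6 sensor frame with contact near the middle of the array.
frame = [
    [0.0, 0.0, 2.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 2.0, 0.0, 0.0],
]
commands = sparse_actuator_commands(frame, 3)
print(commands)
```

Applied frame by frame to a recording, a mapping of this kind yields a low-resolution drive signal per actuator; the data-driven method in the paper may weight or select sensor regions differently.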
HC Mar 2, 2020
Investigating Social Haptic Illusions for Tactile Stroking (SHIFTS)
Cara M. Nunez, Bryce N. Huerta, Allison M. Okamura et al.
A common and effective form of social touch is stroking on the forearm. We seek to replicate this stroking sensation using haptic illusions. This work compares two methods that provide sequential discrete stimulation: sequential normal indentation and sequential lateral skin-slip using discrete actuators. Our goals are to understand which form of stimulation more effectively creates a continuous stroking sensation, and how many discrete contact points are needed. We performed a study with 20 participants in which they rated sensations from the haptic devices on continuity and pleasantness. We found that lateral skin-slip created a more continuous sensation, and decreasing the number of contact points decreased the continuity. These results inform the design of future wearable haptic devices and the creation of haptic signals for effective social communication.
HC Sep 3, 2019
Understanding Continuous and Pleasant Linear Sensations on the Forearm from a Sequential Discrete Lateral Skin-Slip Haptic Device
Cara M. Nunez, Sophia R. Williams, Allison M. Okamura et al.
A continuous stroking sensation on the skin can convey messages or emotion cues. We seek to induce this sensation using a combination of illusory motion and lateral stroking via a haptic device. Our system provides discrete lateral skin-slip on the forearm with rotating tactors, which independently provide lateral skin-slip in a timed sequence. We vary the sensation by changing the angular velocity and delay between adjacent tactors, such that the apparent speed of the perceived stroke ranges from 2.5 to 48.2 cm/s. We investigated which actuation parameters create the most pleasant and continuous sensations through a user study with 16 participants. On average, the sensations were rated by participants as both continuous and pleasant. The most continuous and pleasant sensations were created by apparent speeds of 7.7 and 5.1 cm/s, respectively. We also investigated the effect of spacing between contact points on the pleasantness and continuity of the stroking sensation, and found that the users experience a pleasant and continuous linear sensation even when the space between contact points is relatively large (40 mm). Understanding how sequential discrete lateral skin-slip creates continuous linear sensations can influence the design and control of future wearable haptic devices.
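To give a feel for the timing involved, the sketch below converts a target apparent stroke speed into a delay between adjacent tactors. It assumes the simplest possible model, apparent speed = contact spacing / inter-tactor delay, which ignores the contribution of each tactor's angular velocity; the abstract does not state the exact relationship, so this model is an assumption. Pairing the 40 mm spacing with the reported speeds is likewise for illustration only, and `inter_tactor_delay_s` is a hypothetical helper.

```python
def inter_tactor_delay_s(spacing_cm: float, apparent_speed_cm_s: float) -> float:
    """Delay between adjacent tactors under the assumed model speed = spacing / delay."""
    if spacing_cm <= 0 or apparent_speed_cm_s <= 0:
        raise ValueError("spacing and speed must be positive")
    return spacing_cm / apparent_speed_cm_s


spacing_cm = 4.0  # 40 mm, the larger spacing mentioned in the abstract

# Apparent speeds quoted in the abstract: the range limits and the two optima.
delays = {
    speed: inter_tactor_delay_s(spacing_cm, speed)
    for speed in (2.5, 5.1, 7.7, 48.2)
}
for speed, delay in delays.items():
    print(f"{speed:5.1f} cm/s -> {delay * 1000:7.1f} ms between tactors")
```

Under this assumed model, the slowest reported stroke corresponds to a delay of 1.6 s between tactors at 40 mm spacing, while the fastest corresponds to roughly 83 ms.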