ROJun 11, 2023
Reinforcement Learning in Robotic Motion Planning by Combined Experience-based Planning and Self-Imitation LearningSha Luo, Lambert Schomaker
High-quality and representative data is essential for both Imitation Learning (IL)- and Reinforcement Learning (RL)-based motion planning tasks. For real robots, it is challenging to collect enough qualified data either as demonstrations for IL or experiences for RL due to safety considerations in environments with obstacles. We target this challenge by proposing the self-imitation learning by planning plus (SILP+) algorithm, which efficiently embeds experience-based planning into the learning architecture to mitigate the data-collection problem. The planner generates demonstrations based on successfully visited states from the current RL policy, and the policy improves by learning from these demonstrations. In this way, we relieve the demand for human expert operators to collect demonstrations required by IL and improve the RL performance as well. Various experimental results show that SILP+ achieves better training efficiency higher and more stable success rate in complex motion planning tasks compared to several other methods. Extensive tests on physical robots illustrate the effectiveness of SILP+ in a physical setting.
CVJun 12, 2024
Refusal as Silence: Gendered Disparities in Vision-Language Model ResponsesSha Luo, Sang Jung Kim, Zening Duan et al.
Refusal behavior by Large Language Models is increasingly visible in content moderation, yet little is known about how refusals vary by the identity of the user making the request. This study investigates refusal as a sociotechnical outcome through a counterfactual persona design that varies gender identity--including male, female, non-binary, and transgender personas--while keeping the classification task and visual input constant. Focusing on a vision-language model (GPT-4V), we examine how identity-based language cues influence refusal in binary gender classification tasks. We find that transgender and non-binary personas experience significantly higher refusal rates, even in non-harmful contexts. Our findings also provide methodological implications for equity audits and content analysis using LLMs. Our findings underscore the importance of modeling identity-driven disparities and caution against uncritical use of AI systems for content coding. This study advances algorithmic fairness by reframing refusal as a communicative act that may unevenly regulate epistemic access and participation.
ROJun 3, 2021
Simultaneous Multi-View Object Recognition and Grasping in Open-Ended DomainsHamidreza Kasaei, Sha Luo, Remo Sasso et al.
To aid humans in everyday tasks, robots need to know which objects exist in the scene, where they are, and how to grasp and manipulate them in different situations. Therefore, object recognition and grasping are two key functionalities for autonomous robots. Most state-of-the-art approaches treat object recognition and grasping as two separate problems, even though both use visual input. Furthermore, the knowledge of the robot is fixed after the training phase. In such cases, if the robot encounters new object categories, it must be retrained to incorporate new information without catastrophic forgetting. In order to resolve this problem, we propose a deep learning architecture with an augmented memory capacity to handle open-ended object recognition and grasping simultaneously. In particular, our approach takes multi-views of an object as input and jointly estimates pixel-wise grasp configuration as well as a deep scale- and rotation-invariant representation as output. The obtained representation is then used for open-ended object recognition through a meta-active learning technique. We demonstrate the ability of our approach to grasp never-seen-before objects and to rapidly learn new object categories using very few examples on-site in both simulation and real-world settings. A video of these experiments is available online at: https://youtu.be/n9SMpuEkOgk
ROMar 25, 2021
Self-Imitation Learning by PlanningSha Luo, Hamidreza Kasaei, Lambert Schomaker
Imitation learning (IL) enables robots to acquire skills quickly by transferring expert knowledge, which is widely adopted in reinforcement learning (RL) to initialize exploration. However, in long-horizon motion planning tasks, a challenging problem in deploying IL and RL methods is how to generate and collect massive, broadly distributed data such that these methods can generalize effectively. In this work, we solve this problem using our proposed approach called {self-imitation learning by planning (SILP)}, where demonstration data are collected automatically by planning on the visited states from the current policy. SILP is inspired by the observation that successfully visited states in the early reinforcement learning stage are collision-free nodes in the graph-search based motion planner, so we can plan and relabel robot's own trials as demonstrations for policy learning. Due to these self-generated demonstrations, we relieve the human operator from the laborious data preparation process required by IL and RL methods in solving complex motion planning tasks. The evaluation results show that our SILP method achieves higher success rates and enhances sample efficiency compared to selected baselines, and the policy learned in simulation performs well in a real-world placement task with changing goals and obstacles.
AIFeb 7, 2020
Accelerating Reinforcement Learning for Reaching using Continuous Curriculum LearningSha Luo, Hamidreza Kasaei, Lambert Schomaker
Reinforcement learning has shown great promise in the training of robot behavior due to the sequential decision making characteristics. However, the required enormous amount of interactive and informative training data provides the major stumbling block for progress. In this study, we focus on accelerating reinforcement learning (RL) training and improving the performance of multi-goal reaching tasks. Specifically, we propose a precision-based continuous curriculum learning (PCCL) method in which the requirements are gradually adjusted during the training process, instead of fixing the parameter in a static schedule. To this end, we explore various continuous curriculum strategies for controlling a training process. This approach is tested using a Universal Robot 5e in both simulation and real-world multi-goal reach experiments. Experimental results support the hypothesis that a static training schedule is suboptimal, and using an appropriate decay function for curriculum learning provides superior results in a faster way.
ROMay 14, 2018
Distributed Circumnavigation Control with Dynamic Spacings for a Heterogeneous Multi-robot SystemWeijia Yao, Sha Luo, Huimin Lu et al.
Circumnavigation control is useful in real-world applications such as entrapping a hostile target. In this paper, we consider a heterogeneous multi-robot system where robots have different physical properties, such as maximum movement speeds. Instead of equal-spacings, dynamic spacings according to robots' properties, which are termed utilities in this paper, will be more desirable in a scenario such as target entrapment. A distributed circumnavigation control algorithm based on utilities is proposed for any number of mobile robots from random 3D positions to circumnavigate a target. The dynamic spacings are subject to the variation of robots' utilities. The robots can only obtain the angular positions and utilities of their two neighbouring robots, so the control law is distributed. Theoretical analysis and experimental results are provided to prove the stability and effectiveness of the proposed control algorithm.