Xibai Lou

RO
h-index8
5papers
132citations
Novelty57%
AI Score31

5 Papers

ROJan 4, 2025
Attribute-Based Robotic Grasping with Data-Efficient Adaptation

Yang Yang, Houjian Yu, Xibai Lou et al.

Robotic grasping is one of the most fundamental robotic manipulation tasks and has been the subject of extensive research. However, swiftly teaching a robot to grasp a novel target object in clutter remains challenging. This paper attempts to address the challenge by leveraging object attributes that facilitate recognition, grasping, and rapid adaptation to new domains. In this work, we present an end-to-end encoder-decoder network to learn attribute-based robotic grasping with data-efficient adaptation capability. We first pre-train the end-to-end model with a variety of basic objects to learn generic attribute representation for recognition and grasping. Our approach fuses the embeddings of a workspace image and a query text using a gated-attention mechanism and learns to predict instance grasping affordances. To train the joint embedding space of visual and textual attributes, the robot utilizes object persistence before and after grasping. Our model is self-supervised in a simulation that only uses basic objects of various colors and shapes but generalizes to novel objects in new environments. To further facilitate generalization, we propose two adaptation methods, adversarial adaption and one-grasp adaptation. Adversarial adaptation regulates the image encoder using augmented data of unlabeled images, whereas one-grasp adaptation updates the overall end-to-end model using augmented data from one grasp trial. Both adaptation methods are data-efficient and considerably improve instance grasping performance. Experimental results in both simulation and the real world demonstrate that our approach achieves over 81% instance grasping success rate on unknown objects, which outperforms several baselines by large margins.

ROApr 6, 2021
Attribute-Based Robotic Grasping with One-Grasp Adaptation

Yang Yang, Yuanhao Liu, Hengyue Liang et al.

Robotic grasping is one of the most fundamental robotic manipulation tasks and has been actively studied. However, how to quickly teach a robot to grasp a novel target object in clutter remains challenging. This paper attempts to tackle the challenge by leveraging object attributes that facilitate recognition, grasping, and quick adaptation. In this work, we introduce an end-to-end learning method of attribute-based robotic grasping with one-grasp adaptation capability. Our approach fuses the embeddings of a workspace image and a query text using a gated-attention mechanism and learns to predict instance grasping affordances. Besides, we utilize object persistence before and after grasping to learn a joint metric space of visual and textual attributes. Our model is self-supervised in a simulation that only uses basic objects of various colors and shapes but generalizes to novel objects and real-world scenes. We further demonstrate that our model is capable of adapting to novel objects with only one grasp data and improving instance grasping performance significantly. Experimental results in both simulation and the real world demonstrate that our approach achieves over 80\% instance grasping success rate on unknown objects, which outperforms several baselines by large margins.

ROApr 1, 2021
Collision-Aware Target-Driven Object Grasping in Constrained Environments

Xibai Lou, Yang Yang, Changhyun Choi

Grasping a novel target object in constrained environments (e.g., walls, bins, and shelves) requires intensive reasoning about grasp pose reachability to avoid collisions with the surrounding structures. Typical 6-DoF robotic grasping systems rely on the prior knowledge about the environment and intensive planning computation, which is ungeneralizable and inefficient. In contrast, we propose a novel Collision-Aware Reachability Predictor (CARP) for 6-DoF grasping systems. The CARP learns to estimate the collision-free probabilities for grasp poses and significantly improves grasping in challenging environments. The deep neural networks in our approach are trained fully by self-supervision in simulation. The experiments in both simulation and the real world show that our approach achieves more than 75% grasping rate on novel objects in various surrounding structures. The ablation study demonstrates the effectiveness of the CARP, which improves the 6-DoF grasping rate by 95.7%.

ROOct 14, 2019
Learning to Generate 6-DoF Grasp Poses with Reachability Awareness

Xibai Lou, Yang Yang, Changhyun Choi

Motivated by the stringent requirements of unstructured real-world where a plethora of unknown objects reside in arbitrary locations of the surface, we propose a voxel-based deep 3D Convolutional Neural Network (3D CNN) that generates feasible 6-DoF grasp poses in unrestricted workspace with reachability awareness. Unlike the majority of works that predict if a proposed grasp pose within the restricted workspace will be successful solely based on grasp pose stability, our approach further learns a reachability predictor that evaluates if the grasp pose is reachable or not from robot's own experience. To avoid the laborious real training data collection, we exploit the power of simulation to train our networks on a large-scale synthetic dataset. This work is an early attempt that simultaneously evaluates grasping reachability from learned knowledge while proposing feasible grasp poses with 3D CNN. Experimental results in both simulation and real-world demonstrate that our approach outperforms several other methods and achieves 82.5% grasping success rate on unknown objects.

ROOct 9, 2019
Learning Visual Affordances with Target-Orientated Deep Q-Network to Grasp Objects by Harnessing Environmental Fixtures

Hengyue Liang, Xibai Lou, Yang Yang et al.

This paper introduces a challenging object grasping task and proposes a self-supervised learning approach. The goal of the task is to grasp an object which is not feasible with a single parallel gripper, but only with harnessing environment fixtures (e.g., walls, furniture, heavy objects). This Slide-to-Wall grasping task assumes no prior knowledge except the partial observation of a target object. Hence the robot should learn an effective policy given a scene observation that may include the target object, environmental fixtures, and any other disturbing objects. We formulate the problem as visual affordances learning for which Target-Oriented Deep Q-Network (TO-DQN) is proposed to efficiently learn visual affordance maps (i.e., Q-maps) to guide robot actions. Since the training necessitates robot's exploration and collision with the fixtures, TO-DQN is first trained safely with a simulated robot manipulator and then applied to a real robot. We empirically show that TO-DQN can learn to solve the task in different environment settings in simulation and outperforms a standard and a variant of Deep Q-Network (DQN) in terms of training efficiency and robustness. The testing performance in both simulation and real-robot experiments shows that the policy trained by TO-DQN achieves comparable performance to humans.