Jingdong Zhao

RO
h-index: 18
4 papers
10 citations
Novelty: 45%
AI Score: 39

4 Papers

43.5 RO May 17
A Visual Reinforcement Learning-Based Separate Primitive Policy for Peg-in-Hole Tasks

Zichun Xu, Zhaomin Wang, Yuntao Li et al.

For peg-in-hole tasks, humans rely on binocular visual perception to locate the peg above the hole surface and then proceed with insertion. This paper draws on this behavior to enable agents to learn efficient assembly strategies through visual reinforcement learning. To this end, we propose a Separate Primitive Policy (S2P) that learns to derive location and insertion actions simultaneously. S2P is compatible with model-free reinforcement learning algorithms. Ten insertion tasks featuring different polygons are developed as evaluation benchmarks. Simulation experiments show that S2P can boost sample efficiency and success rate even under force constraints. Real-world experiments are also performed to verify the feasibility of S2P. Finally, ablation studies examine the generalizability of S2P and some of the factors that affect its performance.

RO Apr 30, 2024
Transformer-Enhanced Motion Planner: Attention-Guided Sampling for State-Specific Decision Making

Lei Zhuang, Jingdong Zhao, Yuntao Li et al.

Sampling-based motion planning (SBMP) algorithms are renowned for their robust global search capabilities. However, the inherent randomness in their sampling mechanisms often results in inconsistent path quality and limited search efficiency. In response to these challenges, this work proposes a novel deep learning-based motion planning framework, named Transformer-Enhanced Motion Planner (TEMP), which combines an Environmental Information Semantic Encoder (EISE) with a Motion Planning Transformer (MPT). EISE converts environmental data into semantic environmental information (SEI), providing MPT with an enriched understanding of the environment. MPT uses an attention mechanism to dynamically recalibrate its focus on SEI, task objectives, and historical planning data, refining the generation of sampling nodes. To demonstrate the capabilities of TEMP, we train our model on a dataset composed of planning results produced by RRT*. EISE and MPT are trained jointly, enabling EISE to autonomously learn and extract patterns from environmental data and thereby form semantic representations that MPT can more effectively interpret and use for motion planning. We then systematically evaluate TEMP across diverse task dimensions; the results demonstrate that TEMP achieves exceptional performance and a higher degree of generalizability than state-of-the-art SBMPs.
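
The abstract does not give MPT's architecture, so the following is only a minimal sketch of the generic scaled dot-product attention step that such a model could build on. The three tokens (environment, goal, planning history), their 2-D key and value vectors, and the function names are hypothetical and chosen purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention-weighted mix of `values` and the weights themselves.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    context = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return context, weights


# Hypothetical tokens (assumption): environment, goal, planning history.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[0.2, 0.8], [0.9, 0.1], [0.5, 0.5]]

context, weights = attend(query, keys, values)
```

Here `weights` sums to one and is largest for the key most aligned with the query, and `context` is a convex combination of the value vectors. In a learned sampler, a vector like `context` would be decoded into a proposed sample; that decoding step is not shown because the abstract does not describe it.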

CV Jul 12, 2025
Multimodal Visual Transformer for Sim2real Transfer in Visual Reinforcement Learning

Zichun Xu, Yuntao Li, Zhaomin Wang et al.

Depth information is robust to scene appearance variations and inherently carries 3D spatial details. In this paper, a visual backbone based on the vision transformer is proposed to fuse RGB and depth modalities for better generalization. The modalities are first processed by separate CNN stems, and the combined convolutional features are passed to a scalable vision transformer to obtain visual representations. Moreover, a contrastive unsupervised learning scheme is designed with masked and unmasked tokens to improve sample efficiency during reinforcement learning. Simulation results demonstrate that our visual backbone focuses more on task-related regions and generalizes better to unseen scenarios. For sim2real transfer, a flexible curriculum learning schedule is developed to deploy domain randomization over the training process. Finally, the feasibility of our model is validated on real-world manipulation tasks via zero-shot transfer.
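
The abstract does not state which contrastive objective is used. As a rough illustration of how masked and unmasked views can be pulled together, the sketch below uses InfoNCE, a common contrastive loss; this choice, the temperature value, and the toy 2-D embeddings are all assumptions rather than details from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss.

    positives[i] is the matching view of anchors[i]; every other row of
    `positives` acts as a negative for that anchor.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_sum - logits[i])  # -log softmax at the positive
    return sum(losses) / len(losses)


# Toy embeddings (assumption): rows are observations, masked vs. unmasked view.
masked = [[1.0, 0.1], [0.0, 1.0]]
unmasked = [[0.9, 0.0], [0.1, 1.0]]

aligned = info_nce(masked, unmasked)         # correct pairing
shuffled = info_nce(masked, unmasked[::-1])  # deliberately wrong pairing
```

With the correct pairing the loss is small, and it grows when the pairs are swapped, which is the signal that pushes the encoder to produce matching representations for the two views of the same observation.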

RO Nov 1, 2018
Collision-Free Kinematics for Redundant Manipulators in Dynamic Scenes using Optimal Reciprocal Velocity Obstacles

Liangliang Zhao, Jingdong Zhao, Hong Liu et al.

We present a novel algorithm for collision-free kinematics of multiple manipulators in a shared workspace with moving obstacles. Our optimization-based approach simultaneously handles collision-free constraints based on reciprocal velocity obstacles and inverse kinematics constraints for high-DOF manipulators. We present an efficient method based on particle swarm optimization that can generate collision-free configurations for each redundant manipulator. Furthermore, our approach can be used to compute safe and oscillation-free trajectories in a few milliseconds. We highlight the real-time performance of our algorithm on multiple Baxter robots with 14-DOF manipulators operating in a workspace with dynamic obstacles. Videos are available at https://sites.google.com/view/collision-free-kinematics
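
To make the velocity obstacle idea concrete, here is a minimal sketch of the basic test for two discs in the plane: a relative velocity lies inside the time-truncated velocity obstacle when it brings the discs closer than their combined radius within a given horizon. This is a simplified 2-D illustration under assumed names and values; it does not cover the reciprocal sharing of avoidance effort, the manipulator geometry, or the particle swarm optimization described in the abstract.

```python
import math


def collides_within(rel_pos, rel_vel, combined_radius, horizon):
    """True if two discs get closer than combined_radius within `horizon` seconds.

    rel_pos = p_other - p_self, rel_vel = v_self - v_other, so the offset
    between the discs at time t is rel_pos - rel_vel * t.
    """
    px, py = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        t = 0.0  # no relative motion: the distance never changes
    else:
        # Time of closest approach, clamped to the horizon [0, horizon].
        t = min(max((px * vx + py * vy) / speed_sq, 0.0), horizon)
    dx, dy = px - vx * t, py - vy * t
    return math.hypot(dx, dy) < combined_radius


# Hypothetical scenarios: the other disc starts 2 m away along the x-axis.
head_on = collides_within((2.0, 0.0), (1.0, 0.0), 0.5, 3.0)
passing = collides_within((2.0, 0.0), (1.0, 1.0), 0.5, 3.0)
too_late = collides_within((2.0, 0.0), (1.0, 0.0), 0.5, 1.0)
```

The head-on velocity is rejected, the diagonal velocity passes at a safe distance, and the same head-on velocity is accepted when the horizon is too short for the discs to meet. An optimizer can use a test like this as a constraint when searching for safe velocities.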