Jingdong Zhao

h-index: 20

3 papers

10 citations

Novelty: 50%

AI Score: 31

Ranked #132,343 of 194,257 authors (top 68%); #3,955 in RO (top 59%)

3 Papers

4.1 | RO | Apr 30, 2024

Transformer-Enhanced Motion Planner: Attention-Guided Sampling for State-Specific Decision Making

Lei Zhuang, Jingdong Zhao, Yuntao Li et al.

Sampling-based motion planning (SBMP) algorithms are renowned for their robust global search capabilities. However, the inherent randomness in their sampling mechanisms often results in inconsistent path quality and limited search efficiency. In response to these challenges, this work proposes a novel deep learning-based motion planning framework, named Transformer-Enhanced Motion Planner (TEMP), which synergizes an Environmental Information Semantic Encoder (EISE) with a Motion Planning Transformer (MPT). EISE converts environmental data into semantic environmental information (SEI), providing MPT with an enriched environmental comprehension. MPT leverages an attention mechanism to dynamically recalibrate its focus on SEI, task objectives, and historical planning data, refining the sampling node generation. To demonstrate the capabilities of TEMP, we train our model using a dataset composed of planning results produced by RRT*. EISE and MPT are trained jointly, enabling EISE to autonomously learn and extract patterns from environmental data, thereby forming semantic representations that MPT can more effectively interpret and utilize for motion planning. Subsequently, we conduct a systematic evaluation of TEMP's efficacy across diverse task dimensions, which demonstrates that TEMP achieves exceptional performance metrics and a heightened degree of generalizability compared to state-of-the-art SBMPs.
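
The attention step described in this abstract can be illustrated with a minimal sketch of scaled dot-product attention for a single query. This is not the TEMP implementation: the token names, the 2-D embeddings, and the `attend` helper are all hypothetical, chosen only to show how a query can be weighted across environment, goal, and history tokens.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the attention weights and the weighted sum of the values.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Hypothetical 2-D embeddings standing in for the three information sources
# named in the abstract (assumed values, not taken from the paper).
tokens = {
    "environment": [1.0, 0.0],
    "goal": [0.0, 1.0],
    "history": [0.5, 0.5],
}
query = [0.0, 2.0]  # a query that is aligned with the "goal" token

weights, output = attend(query, list(tokens.values()), list(tokens.values()))
for name, w in zip(tokens, weights):
    print(f"{name}: {w:.3f}")
```

Because the query points along the same direction as the goal embedding, the goal token receives the largest weight, followed by history and then environment. A learned model would produce the queries, keys, and values from trained projections rather than fixed vectors.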

3.6 | CV | Jul 12, 2025

Multimodal Visual Transformer for Sim2real Transfer in Visual Reinforcement Learning

Zichun Xu, Yuntao Li, Zhaomin Wang et al.

Depth information is robust to scene appearance variations and inherently carries 3D spatial details. In this paper, a visual backbone based on the vision transformer is proposed to fuse RGB and depth modalities for enhancing generalization. Different modalities are first processed by separate CNN stems, and the combined convolutional features are delivered to the scalable vision transformer to obtain visual representations. Moreover, a contrastive unsupervised learning scheme is designed with masked and unmasked tokens to improve sample efficiency during the reinforcement learning process. Simulation results demonstrate that our visual backbone can focus more on task-related regions and exhibit better generalization in unseen scenarios. For sim2real transfer, a flexible curriculum learning schedule is developed to deploy domain randomization over training processes. Finally, the feasibility of our model is validated on real-world manipulation tasks via zero-shot transfer.

1.6 | RO | Nov 1, 2018

Collision-Free Kinematics for Redundant Manipulators in Dynamic Scenes using Optimal Reciprocal Velocity Obstacles

Liangliang Zhao, Jingdong Zhao, Hong Liu et al.

We present a novel algorithm for collision-free kinematics of multiple manipulators in a shared workspace with moving obstacles. Our optimization-based approach simultaneously handles collision-free constraints based on reciprocal velocity obstacles and inverse kinematics constraints for high-DOF manipulators. We present an efficient method based on particle swarm optimization that can generate collision-free configurations for each redundant manipulator. Furthermore, our approach can be used to compute safe and oscillation-free trajectories in a few milliseconds. We highlight the real-time performance of our algorithm on multiple Baxter robots with 14-DOF manipulators operating in a workspace with dynamic obstacles. Videos are available at https://sites.google.com/view/collision-free-kinematics
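
As a rough illustration of the particle swarm optimization idea mentioned in this abstract, the sketch below solves inverse kinematics for a planar 3-joint arm. It is a simplified stand-in, not the authors' algorithm: the link lengths, swarm coefficients, target, and function names are assumptions, and the reciprocal velocity obstacle constraints from the paper are omitted entirely.

```python
import math
import random

LINKS = [1.0, 1.0, 1.0]  # hypothetical link lengths of a planar 3-joint arm


def end_effector(angles):
    """Forward kinematics: joint angles -> (x, y) of the arm tip."""
    x = y = total = 0.0
    for length, angle in zip(LINKS, angles):
        total += angle
        x += length * math.cos(total)
        y += length * math.sin(total)
    return x, y


def cost(angles, target):
    """Distance between the arm tip and the target point."""
    x, y = end_effector(angles)
    return math.hypot(x - target[0], y - target[1])


def pso_ik(target, particles=30, steps=200, seed=0):
    """Basic global-best PSO over joint angles (no collision terms)."""
    rng = random.Random(seed)
    dim = len(LINKS)
    pos = [[rng.uniform(-math.pi, math.pi) for _ in range(dim)]
           for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p, target) for p in pos]
    g = min(range(particles), key=best_cost.__getitem__)
    global_pos, global_cost = best_pos[g][:], best_cost[g]
    initial_cost = global_cost
    for _ in range(steps):
        for i in range(particles):
            for d in range(dim):
                # Inertia plus pulls toward personal and global bests
                # (0.7 and 1.5 are assumed, commonly used coefficients).
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (global_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i], target)
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < global_cost:
                    global_pos, global_cost = pos[i][:], c
    return global_pos, global_cost, initial_cost


angles, final_cost, initial_cost = pso_ik(target=(1.5, 1.0))
print(f"initial best error: {initial_cost:.4f}, final error: {final_cost:.4f}")
```

Since the arm has three joints but the target has only two coordinates, many joint configurations reach the same point. A method like the one described in the abstract could exploit that redundancy by adding collision-avoidance terms to the cost, which this sketch does not attempt.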