Lei Yan

7papers

113citations

Novelty41%

AI Score40

Ranked #99,906 of 205,806 authors (top 49%)#3,404 in RO (top 45%)

7 Papers

ROOct 13, 2022Code

ROS-PyBullet Interface: A Framework for Reliable Contact Simulation and Human-Robot Interaction

Christopher E. Mower, Theodoros Stouraitis, João Moura et al.

Reliable contact simulation plays a key role in the development of (semi-)autonomous robots, especially when dealing with contact-rich manipulation scenarios, an active robotics research topic. Besides simulation, components such as sensing, perception, data collection, robot hardware control, human interfaces, etc. are all key enablers towards applying machine learning algorithms or model-based approaches in real world systems. However, there is a lack of software connecting reliable contact simulation with the larger robotics ecosystem (i.e. ROS, Orocos), for a more seamless application of novel approaches, found in the literature, to existing robotic hardware. In this paper, we present the ROS-PyBullet Interface, a framework that provides a bridge between the reliable contact/impact simulator PyBullet and the Robot Operating System (ROS). Furthermore, we provide additional utilities for facilitating Human-Robot Interaction (HRI) in the simulated environment. We also present several use-cases that highlight the capabilities and usefulness of our framework. Please check our video, source code, and examples included in the supplementary material. Our full code base is open source and can be found at https://github.com/cmower/ros_pybullet_interface.

LGJul 5, 2023

Dynamic Feature-based Deep Reinforcement Learning for Flow Control of Circular Cylinder with Sparse Surface Pressure Sensing

Qiulei Wang, Lei Yan, Gang Hu et al.

This study proposes a self-learning algorithm for closed-loop cylinder wake control targeting lower drag and lower lift fluctuations with the additional challenge of sparse sensor information, taking deep reinforcement learning as the starting point. DRL performance is significantly improved by lifting the sensor signals to dynamic features (DF), which predict future flow states. The resulting dynamic feature-based DRL (DF-DRL) automatically learns a feedback control in the plant without a dynamic model. Results show that the drag coefficient of the DF-DRL model is 25% less than the vanilla model based on direct sensor feedback. More importantly, using only one surface pressure sensor, DF-DRL can reduce the drag coefficient to a state-of-the-art performance of about 8% at Re = 100 and significantly mitigate lift coefficient fluctuations. Hence, DF-DRL allows the deployment of sparse sensing of the flow without degrading the control performance. This method also shows good robustness in controlling flow under higher Reynolds numbers, which reduces the drag coefficient by 32.2% and 46.55% at Re = 500 and 1000, respectively, indicating the broad applicability of the method. Since surface pressure information is more straightforward to measure in realistic scenarios than flow velocity information, this study provides a valuable reference for experimentally designing the active flow control of a circular cylinder based on wall pressure signals, which is an essential step toward further developing intelligent control in realistic multi-input multi-output (MIMO) system.

AISep 27, 2022

Reinforcement Learning for Cognitive Delay/Disruption Tolerant Network Node Management in an LEO-based Satellite Constellation

Xue Sun, Changhao Li, Lei Yan et al.

In recent years, with the large-scale deployment of space spacecraft entities and the increase of satellite onboard capabilities, delay/disruption tolerant network (DTN) emerged as a more robust communication protocol than TCP/IP in the case of excessive network dynamics. DTN node buffer management is still an active area of research, as the current implementation of the DTN core protocol still relies on the assumption that there is always enough memory available in different network nodes to store and forward bundles. In addition, the classical queuing theory does not apply to the dynamic management of DTN node buffers. Therefore, this paper proposes a centralized approach to automatically manage cognitive DTN nodes in low earth orbit (LEO) satellite constellation scenarios based on the advanced reinforcement learning (RL) strategy advantage actor-critic (A2C). The method aims to explore training a geosynchronous earth orbit intelligent agent to manage all DTN nodes in an LEO satellite constellation scenario. The goal of the A2C agent is to maximize delivery success rate and minimize network resource consumption cost while considering node memory utilization. The intelligent agent can dynamically adjust the radio data rate and perform drop operations based on bundle priority. In order to measure the effectiveness of applying A2C technology to DTN node management issues in LEO satellite constellation scenarios, this paper compares the trained intelligent agent strategy with the other two non-RL policies, including random and standard policies. Experiments show that the A2C strategy balances delivery success rate and cost, and provides the highest reward and the lowest node memory utilization.

82.7LGApr 9

HiFloat4 Format for Language Model Pre-training on Ascend NPUs

Mehran Taghian, Yunke Peng, Xing Huang et al.

Large foundation models have become central to modern machine learning, with performance scaling predictably with model size and data. However, training and deploying such models incur substantial computational and memory costs, motivating the development of low-precision training techniques. Recent work has demonstrated that 4-bit floating-point (FP4) formats--such as MXFP4 and NVFP4--can be successfully applied to linear GEMM operations in large language models (LLMs), achieving up to 4x improvements in compute throughput and memory efficiency compared to higher-precision baselines. In this work, we investigate the recently proposed HiFloat4 FP4 format for Huawei Ascend NPUs and systematically compare it with MXFP4 in large-scale training settings. All experiments are conducted on Ascend NPU clusters, with linear and expert GEMM operations performed entirely in FP4 precision. We evaluate both dense architectures (e.g., Pangu and LLaMA-style models) and mixture-of-experts (MoE) models, where both standard linear layers and expert-specific GEMMs operate in FP4. Furthermore, we explore stabilization techniques tailored to FP4 training that significantly reduce numerical degradation, maintaining relative error within 1% of full-precision baselines while preserving the efficiency benefits of 4-bit computation. Our results provide a comprehensive empirical study of FP4 training on NPUs and highlight the practical trade-offs between FP4 formats in large-scale dense and MoE models.

ROFeb 7, 2021

Decentralized Ability-Aware Adaptive Control for Multi-robot Collaborative Manipulation

Lei Yan, Theodoros Stouraitis, Sethu Vijayakumar

Multi-robot teams can achieve more dexterous, complex and heavier payload tasks than a single robot, yet effective collaboration is required. Multi-robot collaboration is extremely challenging due to the different kinematic and dynamics capabilities of the robots, the limited communication between them, and the uncertainty of the system parameters. In this paper, a Decentralized Ability-Aware Adaptive Control is proposed to address these challenges based on two key features. Firstly, the common manipulation task is represented by the proposed nominal task ellipsoid, which is used to maximize each robot force capability online via optimizing its configuration. Secondly, a decentralized adaptive controller is designed to be Lyapunov stable in spite of heterogeneous actuation constraints of the robots and uncertain physical parameters of the object and environment. In the proposed framework, decentralized coordination and load distribution between the robots is achieved without communication, while only the control deficiency is broadcast if any of the robots reaches its force limits. In this case, the object reference trajectory is modified in a decentralized manner to guarantee stable interaction. Finally, we perform several numerical and physical simulations to analyse and verify the proposed method with heterogeneous multi-robot teams in collaborative manipulation tasks.

ROJun 23, 2020

Multi-mode Trajectory Optimization for Impact-aware Manipulation

Theodoros Stouraitis, Lei Yan, João Moura et al.

The transition from free motion to contact is a challenging problem in robotics, in part due to its hybrid nature. Additionally, disregarding the effects of impacts at the motion planning level often results in intractable impulsive contact forces. In this paper, we introduce an impact-aware multi-mode trajectory optimization (TO) method that combines hybrid dynamics and hybrid control in a coherent fashion. A key concept is the incorporation of an explicit contact force transmission model in the TO method. This allows the simultaneous optimization of the contact forces, contact timings, continuous motion trajectories and compliance, while satisfying task constraints. We compare our method against standard compliance control and an impact-agnostic TO method in physical simulations. Further, we experimentally validate the proposed method with a robot manipulator on the task of halting a large-momentum object.

CVMar 25, 2019

Deep Shape from Polarization

Yunhao Ba, Alex Ross Gilbert, Franklin Wang et al.

This paper makes a first attempt to bring the Shape from Polarization (SfP) problem to the realm of deep learning. The previous state-of-the-art methods for SfP have been purely physics-based. We see value in these principled models, and blend these physical models as priors into a neural network architecture. This proposed approach achieves results that exceed the previous state-of-the-art on a challenging dataset we introduce. This dataset consists of polarization images taken over a range of object textures, paints, and lighting conditions. We report that our proposed method achieves the lowest test error on each tested condition in our dataset, showing the value of blending data-driven and physics-driven approaches.