Bifeng Song

RO
h-index: 6
Papers: 4
Citations: 97
Novelty: 51%
AI Score: 28

4 Papers

AI · May 22, 2024
ConcertoRL: An Innovative Time-Interleaved Reinforcement Learning Approach for Enhanced Control in Direct-Drive Tandem-Wing Vehicles

Minghao Zhang, Bifeng Song, Changhao Chen et al.

In control problems for insect-scale direct-drive experimental platforms under tandem-wing influence, the primary challenges facing existing reinforcement learning models are limited safety during exploration and instability of the continuous training process. We introduce the ConcertoRL algorithm to enhance control precision and stabilize the online training process. It consists of two main innovations: a time-interleaved mechanism that interweaves classical controllers with reinforcement learning-based controllers to improve control precision in the initial stages, and a policy composer that organizes the experience gained from previous learning to ensure the stability of the online training process. This paper conducts a series of experiments. First, experiments incorporating the time-interleaved mechanism demonstrate a performance boost of approximately 70% over scenarios without reinforcement learning enhancements and a 50% increase in efficiency compared to reference controllers with doubled control frequencies. These results highlight the algorithm's ability to create a synergistic effect that exceeds the sum of its parts.
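The time-interleaved idea in the abstract above can be illustrated with a small sketch. This is only one possible reading, not the ConcertoRL implementation: it assumes a PD law as the classical controller, a strict alternation between controllers by time step, and a caller-supplied `policy` function standing in for the learned controller. The names `pd_controller`, `interleaved_actions`, and `period`, and the gains, are all hypothetical.

```python
from typing import Callable, List, Sequence


def pd_controller(error: float, d_error: float,
                  kp: float = 2.0, kd: float = 0.5) -> float:
    """Classical PD law (assumed stand-in for the classical controller)."""
    return kp * error + kd * d_error


def interleaved_actions(errors: Sequence[float],
                        policy: Callable[[float, float], float],
                        period: int = 2) -> List[float]:
    """Alternate between the PD law and a learned policy over time steps.

    Steps where t % period == 0 use the PD law; all other steps use
    `policy`. This alternation rule is an assumption for illustration.
    """
    actions: List[float] = []
    prev = errors[0]
    for t, e in enumerate(errors):
        d = e - prev  # finite-difference error derivative (unit time step)
        if t % period == 0:
            actions.append(pd_controller(e, d))
        else:
            actions.append(policy(e, d))
        prev = e
    return actions


if __name__ == "__main__":
    # A placeholder policy that always outputs zero.
    print(interleaved_actions([1.0, 0.5, 0.25], lambda e, d: 0.0))
```

In this sketch the classical controller guarantees a sensible action on its own steps, which is one way such a scheme could limit unsafe exploration early in training.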

RO · Nov 3, 2021
A Novel Actuation Strategy for an Agile Bio-inspired FWAV Performing a Morphing-coupled Wingbeat Pattern

Ang Chen, Bifeng Song, Zhihe Wang et al.

Flying vertebrates exhibit sophisticated wingbeat kinematics. Their specialized forelimbs allow the wing morphing motion to couple with the flapping motion during level flight. Previous flyable bionic platforms have successfully applied bio-inspired wing morphing but cannot yet be propelled by the morphing-coupled wingbeat pattern. Spurred by this, we develop a bio-inspired flapping-wing aerial vehicle (FWAV) named RoboFalcon, which is equipped with a novel mechanism to drive the bat-style morphing wings, performs a morphing-coupled wingbeat pattern, and overall manages an appealing flight. The novel mechanism of RoboFalcon allows coupling the morphing and flapping during level flight and decoupling them when maneuvering is required, producing a bilateral asymmetric downstroke that affords high rolling agility. The bat-style morphing wing is designed with a tilted mounting angle around the radius at the wrist joint to mimic the wrist supination and pronation effect of flying vertebrates' forelimbs. The agility of RoboFalcon is assessed through several rolling maneuver flight tests, and we demonstrate its strong agility compared to flying creatures and current flapping-wing platforms. Wind tunnel tests indicate that the roll moment of the asymmetric downstroke is correlated with the flapping frequency, and that the wrist mounting angle can be used for tuning the angle of attack and lift-thrust configuration of the equilibrium flight state. We believe that this work yields a well-performing bionic platform and provides a new actuation strategy for morphing-coupled flapping flight.

RO · Sep 30, 2020
Explainable Deep Reinforcement Learning for UAV Autonomous Navigation

Lei He, Nabil Aouf, Bifeng Song

Autonomous navigation in unknown complex environments is still a hard problem, especially for small Unmanned Aerial Vehicles (UAVs) with limited computation resources. In this paper, a neural network-based reactive controller is proposed for a quadrotor to fly autonomously in an unknown outdoor environment. The navigation controller uses only current sensor data to generate the control signal, without any optimization or configuration space searching, which reduces both memory and computation requirements. The navigation problem is modelled as a Markov Decision Process (MDP) and solved using a deep reinforcement learning (DRL) method. To better understand the trained network, some model explanation methods are proposed. Based on feature attribution, each decision made during flight is explained using both visual and textual explanations. Moreover, some global analyses are provided for experts to evaluate and improve the trained neural network. The simulation results show that the proposed method can produce useful and reasonable explanations for the trained model, which is beneficial for both non-expert users and controller designers. Finally, real-world tests show that the proposed controller can navigate the quadrotor to the goal position successfully and that the reactive controller runs much faster than some conventional approaches under the same computation resources.
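As a rough illustration of what feature attribution means for a reactive controller, the sketch below scores each input feature by how much the controller output changes when that feature is replaced with a baseline value. This occlusion-style scheme is a generic technique chosen for illustration and is not necessarily the attribution method used in the paper; the function name `occlusion_attribution` and the toy linear controller are hypothetical.

```python
from typing import Callable, List, Sequence


def occlusion_attribution(f: Callable[[Sequence[float]], float],
                          x: Sequence[float],
                          baseline: float = 0.0) -> List[float]:
    """Score feature i as f(x) minus f(x with feature i set to baseline)."""
    full = f(x)
    scores: List[float] = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline  # remove the information carried by feature i
        scores.append(full - f(occluded))
    return scores


if __name__ == "__main__":
    # Toy linear "controller": output = 2*x0 - 1*x1 + 0.5*x2 (assumed weights).
    def controller(v: Sequence[float]) -> float:
        return 2.0 * v[0] - 1.0 * v[1] + 0.5 * v[2]

    print(occlusion_attribution(controller, [1.0, 2.0, 4.0]))
```

For a linear controller and a zero baseline, each score reduces to weight times input, so positive scores mark features that pushed the output up and negative scores mark features that pushed it down.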

RO · Aug 6, 2020
Deep Reinforcement Learning based Local Planner for UAV Obstacle Avoidance using Demonstration Data

Lei He, Nabil Aouf, James F. Whidborne et al.

In this paper, a deep reinforcement learning (DRL) method is proposed to address the problem of UAV navigation in an unknown environment. However, DRL algorithms are limited by the data efficiency problem, as they typically require a huge amount of data before they reach reasonable performance. To speed up the DRL training process, we developed a novel learning framework that combines imitation learning and reinforcement learning and builds upon the Twin Delayed DDPG (TD3) algorithm. In the newly introduced imitation phase, both the policy and Q-value networks are learned from expert demonstrations. To tackle the distribution mismatch problem in the transfer from imitation to reinforcement learning, both the TD-error and a decayed imitation loss are used to update the pre-trained networks once they start interacting with the environment. The performance of the proposed algorithm is demonstrated on the challenging 3D UAV navigation problem using depth cameras and illustrated in a variety of simulation environments.
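The combination of a TD-error term with a decayed imitation loss can be sketched as a single scalar objective whose imitation weight shrinks as training proceeds. The snippet below is a minimal sketch under stated assumptions, not the paper's implementation: the squared-error forms, the step-wise exponential decay schedule, and the names `imitation_weight` and `combined_loss` with their default parameters are all hypothetical.

```python
def imitation_weight(step: int, initial: float = 1.0,
                     decay: float = 0.5, every: int = 1000) -> float:
    """Imitation-loss weight, multiplied by `decay` once every `every` steps."""
    return initial * decay ** (step // every)


def combined_loss(td_error: float, policy_action: float,
                  expert_action: float, step: int) -> float:
    """Squared TD-error plus a decayed behaviour-cloning penalty.

    Early in training the expert term dominates; as `step` grows, the
    objective approaches the pure reinforcement learning loss.
    """
    td_loss = td_error ** 2
    bc_loss = (policy_action - expert_action) ** 2
    return td_loss + imitation_weight(step) * bc_loss


if __name__ == "__main__":
    for step in (0, 1000, 2000):
        print(step, combined_loss(1.0, 0.5, 0.0, step))
```

Decaying the imitation term rather than dropping it abruptly is one way to soften the shift from the demonstration data distribution to the agent's own experience.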