ROAug 27, 2024
Benchmarking Reinforcement Learning Methods for Dexterous Robotic Manipulation with a Three-Fingered GripperElizabeth Cutler, Yuning Xing, Tony Cui et al.
Reinforcement Learning (RL) training is predominantly conducted in cost-effective and controlled simulation environments. However, the transfer of these trained models to real-world tasks often presents unavoidable challenges. This research explores the direct training of RL algorithms in controlled yet realistic real-world settings for the execution of dexterous manipulation. The benchmarking results of three RL algorithms trained on intricate in-hand manipulation tasks within practical real-world contexts are presented. Our study not only demonstrates the practicality of RL training in authentic real-world scenarios, facilitating direct real-world applications, but also provides insights into the associated challenges and considerations. Additionally, our experiences with the employed experimental methods are shared, with the aim of empowering and engaging fellow researchers and practitioners in this dynamic field of robotics.
AIJul 31, 2024
Image-Based Deep Reinforcement Learning with Intrinsically Motivated Stimuli: On the Execution of Complex Robotic TasksDavid Valencia, Henry Williams, Yuning Xing et al.
Reinforcement Learning (RL) has been widely used to solve tasks where the environment consistently provides a dense reward value. However, in real-world scenarios, rewards can often be poorly defined or sparse. Auxiliary signals are indispensable for discovering efficient exploration strategies and aiding the learning process. In this work, inspired by intrinsic motivation theory, we postulate that the intrinsic stimuli of novelty and surprise can assist in improving exploration in complex, sparsely rewarded environments. We introduce a novel sample-efficient method able to learn directly from pixels, an image-based extension of TD3 with an autoencoder called \textit{NaSA-TD3}. The experiments demonstrate that NaSA-TD3 is easy to train and an efficient method for tackling complex continuous-control robotic tasks, both in simulated environments and real-world settings. NaSA-TD3 outperforms existing state-of-the-art RL image-based methods in terms of final performance without requiring pre-trained models or human demonstrations.
ROAug 24, 2023
Racing Towards Reinforcement Learning based control of an Autonomous Formula SAE CarAakaash Salvaji, Harry Taylor, David Valencia et al.
With the rising popularity of autonomous navigation research, Formula Student (FS) events are introducing a Driverless Vehicle (DV) category to their event list. This paper presents the initial investigation into utilising Deep Reinforcement Learning (RL) for end-to-end control of an autonomous FS race car for these competitions. We train two state-of-the-art RL algorithms in simulation on tracks analogous to the full-scale design on a Turtlebot2 platform. The results demonstrate that our approach can successfully learn to race in simulation and then transfer to a real-world racetrack on the physical platform. Finally, we provide insights into the limitations of the presented approach and guidance into the future directions for applying RL toward full-scale autonomous FS racing.
LGMay 4, 2024
CTD4 -- A Deep Continuous Distributional Actor-Critic Agent with a Kalman Fusion of Multiple CriticsDavid Valencia, Henry Williams, Yuning Xing et al.
Categorical Distributional Reinforcement Learning (CDRL) has demonstrated superior sample efficiency in learning complex tasks compared to conventional Reinforcement Learning (RL) approaches. However, the practical application of CDRL is encumbered by challenging projection steps, detailed parameter tuning, and domain knowledge. This paper addresses these challenges by introducing a pioneering Continuous Distributional Model-Free RL algorithm tailored for continuous action spaces. The proposed algorithm simplifies the implementation of distributional RL, adopting an actor-critic architecture wherein the critic outputs a continuous probability distribution. Additionally, we propose an ensemble of multiple critics fused through a Kalman fusion mechanism to mitigate overestimation bias. Through a series of experiments, we validate that our proposed method provides a sample-efficient solution for executing complex continuous-control tasks.
LGDec 9, 2024
Bounded Exploration with World Model Uncertainty in Soft Actor-Critic Reinforcement Learning AlgorithmTing Qiao, Henry Williams, David Valencia et al.
One of the bottlenecks preventing Deep Reinforcement Learning algorithms (DRL) from real-world applications is how to explore the environment and collect informative transitions efficiently. The present paper describes bounded exploration, a novel exploration method that integrates both 'soft' and intrinsic motivation exploration. Bounded exploration notably improved the Soft Actor-Critic algorithm's performance and its model-based extension's converging speed. It achieved the highest score in 6 out of 8 experiments. Bounded exploration presents an alternative method to introduce intrinsic motivations to exploration when the original reward function has strict meanings.
SEMay 10, 2021
Trials and Tribulations of Developing Hybrid Quantum-Classical Microservices SystemsJavier Rojo, David Valencia, Javier Berrocal et al.
Quantum computing holds great promise to solve to problems where classical computers cannot reach. To the point where it already arouses the interest of both scientific and industrial communities. Thus, it is expected that hybrid systems will start to appear where quantum software interacts with classical systems. Such coexistence can be fostered by service computing. Unfortunately, the way in which quantum code can be offered as a service still misses out on many of the potential benefits of service computing. This paper takes the traveling salesman problem, and tackles the challenge of giving it an implementation in the form of a quantum microservice. Then it is used to detect which of the benefits of service computing are lost in the process. The conclusions help to measure the distance between the current state of technology and the state that would be desirable in order to have a real quantum service engineering.