RO SY Nov 27, 2021

Reinforcement Learning-based Switching Controller for a Milliscale Robot in a Constrained Environment

Abbas Tariverdi, Ulysse Côté-Allard, Kim Mathiassen, Ole J. Elle, Håvard Kalvøy, Ørjan G. Martinsen, Jim Tørresen

arXiv:2111.13969v2 3.0

Originality: Incremental advance

AI Analysis

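This work addresses navigation for medical and robotic applications, such as capsule endoscopy, in which active control is needed but direct manipulation can be hazardous. It is an incremental advance: the RL-based controller performs comparably to the classical path planners it is tested against and is reported to be more robust to deploy in dynamic environments.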

The paper tackles the problem of autonomously moving a milliscale robot through a constrained environment in the presence of disturbances, using a reinforcement learning-based switching controller that achieved a 98.86% success rate over 30 real-world episodes.

This work presents a reinforcement learning-based switching control mechanism to autonomously move a ferromagnetic object (representing a milliscale robot) around obstacles within a constrained environment in the presence of disturbances. This mechanism can be used to navigate objects (e.g., capsule endoscopes, swarms of drug particles) through complex environments when active control is a necessity but direct manipulation can be hazardous. The proposed control scheme consists of a switching control architecture implemented by two sub-controllers. The first sub-controller employs the robot's inverse kinematic solutions to search the environment for the to-be-carried ferromagnetic particle while being robust to disturbances. The second sub-controller uses a customized Rainbow algorithm to control a robotic arm, i.e., the UR5 robot, to carry the ferromagnetic particle to a desired position through a constrained environment. The customized Rainbow algorithm employs the Quantile Huber loss from the Implicit Quantile Networks (IQN) algorithm and a ResNet. The proposed controller is first trained and tested in a real-time physics simulation engine (PyBullet). Afterward, the trained controller is transferred to a UR5 robot to remotely transport a ferromagnetic particle in a real-world scenario, achieving a 98.86% success rate over 30 episodes for randomly generated trajectories, demonstrating the viability of the proposed approach for real-life applications. In addition, two classical pathfinding approaches, Attractor Dynamics and the Execution Extended Rapidly-Exploring Random Trees (ERRT), are also investigated and compared to the RL-based method. The proposed RL-based algorithm is shown to achieve performance comparable to that of the tested classical path planners whilst being more robust to deploy in dynamic environments.
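The Quantile Huber loss mentioned above can be sketched in a few lines. This is a minimal NumPy illustration of the loss as defined for IQN, not the authors' implementation: the function name `quantile_huber_loss`, the argument layout, and the default threshold `kappa=1.0` are assumptions made for this example, and the abstract does not specify the paper's batching, network architecture, or hyperparameters.

```python
import numpy as np

def quantile_huber_loss(pred, target, taus, kappa=1.0):
    """Quantile Huber loss for one state-action pair (illustrative sketch).

    pred   : (N,) predicted quantile values, one per sampled fraction in `taus`
    target : (M,) target return samples
    taus   : (N,) quantile fractions in (0, 1) at which `pred` was evaluated
    kappa  : Huber threshold (assumed value, not taken from the paper)
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    taus = np.asarray(taus, dtype=float)

    # Pairwise errors u[i, j] = target[j] - pred[i].
    u = target[None, :] - pred[:, None]
    abs_u = np.abs(u)

    # Huber loss: quadratic within kappa of zero, linear beyond it.
    huber = np.where(abs_u <= kappa,
                     0.5 * u ** 2,
                     kappa * (abs_u - 0.5 * kappa))

    # Asymmetric quantile weight |tau_i - 1{u_ij < 0}|.
    weight = np.abs(taus[:, None] - (u < 0).astype(float))

    # Mean over target samples, sum over predicted quantiles.
    return (weight * huber / kappa).mean(axis=1).sum()


# Overestimating is penalised more heavily than underestimating at a low quantile.
under = quantile_huber_loss([0.0], [3.0], [0.25])   # prediction below target
over = quantile_huber_loss([0.0], [-3.0], [0.25])   # prediction above target
```

The asymmetric weight is what makes each output track a specific quantile of the return distribution, while the Huber term keeps gradients bounded for large errors.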
