Generalization in Deep Reinforcement Learning for Robotic Navigation by Reward Shaping
This paper addresses the challenge of generalization for robots navigating unknown, cluttered environments. The contribution is incremental, as it builds on existing DRL methods.
The paper tackles the problem of local minima in deep reinforcement learning for robotic navigation by proposing a novel reward function that incorporates map information. Combined with the Soft Actor-Critic (SAC) algorithm, this reward outperforms the compared methods at avoiding local minima and collisions in sim-to-sim and sim-to-real experiments.
In this paper, we study the application of deep reinforcement learning (DRL) algorithms to local navigation problems, in which a robot equipped only with limited-range exteroceptive sensors, such as LiDAR, moves towards a goal location in unknown and cluttered workspaces. Collision avoidance policies based on DRL present some advantages, but they are quite susceptible to local minima, since their capacity to learn suitable actions is limited to the sensor range. Since most robots perform tasks in unstructured environments, it is of great interest to seek generalized local navigation policies capable of avoiding local minima, especially in untrained scenarios. To do so, we propose a novel reward function that incorporates map information gained in the training stage, increasing the agent's capacity to deliberate about the best course of action. We also use the Soft Actor-Critic (SAC) algorithm to train our artificial neural network (ANN), which proves to be more effective than other algorithms in the state-of-the-art literature. A set of sim-to-sim and sim-to-real experiments illustrates that our proposed reward combined with SAC outperforms the compared methods in terms of local minima and collision avoidance.
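
The abstract does not give the form of the proposed reward, so the following is only a minimal sketch of one way map information available during training could enter a reward signal. It assumes an occupancy grid known at training time and rewards progress in shortest-path distance to the goal rather than straight-line distance. The names path_distance_field and shaped_reward, the grid, and the weights are all hypothetical and are not taken from the paper.

```python
from collections import deque

def path_distance_field(grid, goal):
    """Breadth-first search over free cells (0 = free, 1 = obstacle),
    4-connected. Returns shortest-path distances to `goal` in cells;
    obstacle or unreachable cells stay None."""
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

def shaped_reward(dist, prev_cell, cell, collided, reached,
                  w_progress=1.0, r_collision=-10.0, r_goal=10.0):
    """Hypothetical shaped reward: terminal penalty or bonus, otherwise
    the reduction in map-based path distance achieved by this step."""
    if collided:
        return r_collision
    if reached:
        return r_goal
    before = dist[prev_cell[0]][prev_cell[1]]
    after = dist[cell[0]][cell[1]]
    return w_progress * (before - after)

# Assumed toy map: a U-shaped obstacle whose pocket opens to the left,
# with the goal just behind the closed side of the U.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
GOAL = (2, 4)
DIST = path_distance_field(GRID, GOAL)

# Moving deeper into the pocket reduces straight-line distance to the goal
# but increases path distance, so the map-based term penalises it.
into_pocket = shaped_reward(DIST, (2, 1), (2, 2), collided=False, reached=False)
out_of_pocket = shaped_reward(DIST, (2, 1), (2, 0), collided=False, reached=False)
```

In this toy setting, a reward based on straight-line distance would favour the step into the pocket, which is the kind of local minimum the abstract describes. The distance field is needed only to compute the reward during training, so the policy's observations can remain limited to the range sensor.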