Reward Shaping with Subgoals for Social Navigation
This paper addresses the challenge of training robots for real-world social navigation, though the contribution appears incremental because it builds on existing reinforcement learning approaches.
The paper tackles the problem of slow reinforcement learning in social navigation tasks, where robots must navigate around unpredictable humans. The authors' reward shaping method with subgoals improved learning efficiency compared to a base algorithm.
Social navigation has been gaining attention with the growth in machine intelligence. Because reinforcement learning can select an action at low computational cost in the prediction phase, it has been applied to social navigation tasks. However, reinforcement learning requires an enormous number of iterations in the learning phase before it acquires a behavior policy. This makes it difficult to learn robot behaviors in the real world. In particular, social navigation involves humans, who act as unpredictable moving obstacles in the environment. We propose a reward shaping method with subgoals to accelerate learning. Its main component is an aggregation method that uses subgoals to shape the reward of a reinforcement learning algorithm. We performed a learning experiment on a social navigation task in which a robot had to avoid collisions and reach its goal. The experimental results show that our method improved learning efficiency over a base algorithm on this task.
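The abstract does not spell out the aggregation method, so the following is only a minimal sketch of the general idea of shaping rewards with subgoals, not the authors' implementation. It assumes a standard potential-based formulation in which the potential equals a constant times the number of subgoals achieved so far. The class name SubgoalShaper, the parameters gamma and eta, and the grid-cell subgoals are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SubgoalShaper:
    """Potential-based reward shaping driven by an ordered list of subgoals.

    Assumption (not taken from the paper): the potential of a state is
    eta * (number of subgoals achieved so far), so states are grouped
    by how many subgoals have already been reached.
    """
    subgoals: list          # ordered subgoal states (hypothetical encoding)
    gamma: float = 0.99     # discount factor of the base algorithm
    eta: float = 1.0        # potential gained per achieved subgoal
    achieved: int = 0       # subgoals reached in the current episode

    def potential(self) -> float:
        return self.eta * self.achieved

    def shape(self, reward: float, next_state) -> float:
        """Return reward + gamma * phi(next) - phi(current)."""
        phi_before = self.potential()
        # Subgoals count only when reached in order.
        if (self.achieved < len(self.subgoals)
                and next_state == self.subgoals[self.achieved]):
            self.achieved += 1
        phi_after = self.potential()
        return reward + self.gamma * phi_after - phi_before

    def reset(self) -> None:
        """Call at the start of every episode."""
        self.achieved = 0


if __name__ == "__main__":
    shaper = SubgoalShaper(subgoals=[(1, 0), (2, 0)], gamma=1.0, eta=2.0)
    for state in [(0, 0), (1, 0), (3, 0), (2, 0)]:
        print(state, shaper.shape(0.0, state))
```

In this sketch the shaped reward would replace the environment reward in the base algorithm's update, so the robot receives an immediate bonus each time it reaches the next subgoal instead of waiting for the sparse goal reward.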