RO
Apr 13, 2018

Smooth and Efficient Policy Exploration for Robot Trajectory Learning

arXiv:1804.04903v3
3 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of hyperparameter tuning in robot trajectory learning, though the contribution appears incremental because it combines existing methods.

The paper tackles the problem of manual hyperparameter tuning in robot policy search algorithms by proposing an adaptive exploration rate learning model that combines existing methods. The results show the method achieves the same learning outcome as previous approaches on the ball-in-cup problem but with fewer iterations.

Many policy search algorithms have been proposed for robot learning and have proved practical in real robot applications. However, these algorithms still have hyperparameters, such as the exploration rate, that require manual tuning. Existing methods for setting the exploration rate, whether manual or automatic, may not be general enough or may be hard to apply on a real robot. In this paper, we propose a learning model that updates the exploration rate adaptively. The overall algorithm is a combination of methods proposed by other researchers. The algorithm produces smooth trajectories for the robot, and the updated exploration rate maximizes the lower bound of the expected return. Our method is tested on the ball-in-cup problem. The results show that it achieves the same learning outcome as previous methods but with fewer iterations.
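
The abstract does not state the update equations, so the following is only an illustrative sketch of what adapting an exploration rate can look like. It is not the paper's algorithm. It assumes a generic reward-weighted (EM-style) update in which both the policy parameters and a per-dimension exploration standard deviation are re-estimated from sampled rollouts. The function names, the toy return function, and all parameter values are hypothetical, and the smooth-trajectory component is not modelled.

```python
import math
import random


def rollout_return(theta, target):
    """Hypothetical stand-in for a robot rollout.

    Returns a positive value that equals 1.0 when theta hits the target.
    """
    return math.exp(-sum((t - g) ** 2 for t, g in zip(theta, target)))


def adaptive_policy_search(target, iterations=80, rollouts=50,
                           sigma0=1.0, sigma_min=1e-3, seed=0):
    """Reward-weighted search with an adaptive exploration rate (toy sketch)."""
    rng = random.Random(seed)
    dim = len(target)
    theta = [0.0] * dim      # policy parameters
    sigma = [sigma0] * dim   # per-dimension exploration standard deviation
    history = []             # return of the mean policy after each iteration

    for _ in range(iterations):
        # Sample perturbations with the current exploration rate and evaluate them.
        deltas = [[rng.gauss(0.0, s) for s in sigma] for _ in range(rollouts)]
        returns = [
            rollout_return([t + d for t, d in zip(theta, delta)], target)
            for delta in deltas
        ]
        total = sum(returns)
        if total > 0.0:
            weights = [r / total for r in returns]
            for i in range(dim):
                # Reward-weighted mean of the perturbations moves the parameters.
                step = sum(w * d[i] for w, d in zip(weights, deltas))
                # Reward-weighted second moment (about the old mean, an
                # assumption of this sketch) becomes the new exploration variance.
                var = sum(w * d[i] ** 2 for w, d in zip(weights, deltas))
                theta[i] += step
                sigma[i] = max(math.sqrt(var), sigma_min)
        history.append(rollout_return(theta, target))

    return theta, sigma, history


if __name__ == "__main__":
    theta, sigma, history = adaptive_policy_search([1.5, -0.5])
    print("final parameters:", theta)
    print("final exploration std:", sigma)
    print("final return:", history[-1])
```

In this toy setting the exploration rate needs no hand-tuned decay schedule: it stays large while high-return samples lie far from the current parameters and shrinks as they concentrate near them. The `sigma_min` floor is an added safeguard against premature collapse, not something taken from the paper.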

Code Implementations: 1 repo
