Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks
This work addresses control challenges for robotic manipulators, but it is incremental as it compares existing algorithms without introducing new methods.
The study tackled the problem of controlling robotic manipulators by comparing three off-policy deep reinforcement learning algorithms (DDPG, TD3, SAC) on four tasks in a MuJoCo simulation, analyzing their efficiency and speed without specifying concrete numerical results.
Conventional control methods create obstacles because of system complexity and a high demand on data density, so modern and more efficient control methods are required. Off-policy, model-free reinforcement learning algorithms avoid the need to work with complex models. They have become prominent in terms of speed and accuracy because they reuse past experience to learn optimal policies. In this study, three reinforcement learning algorithms (DDPG, TD3, and SAC) are used to train a Fetch robotic manipulator on four different tasks in the MuJoCo simulation environment. All three algorithms are off-policy and pursue the desired target by optimizing both policy and value functions. The efficiency and speed of the three algorithms are analyzed in a controlled environment.
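The reuse of past experience mentioned above is usually realized with a replay buffer: transitions collected under older policies are stored and later sampled at random for updates. The following is a minimal sketch of that idea, not the implementation used in the study. The names `Transition` and `ReplayBuffer`, the capacity, and the fixed seed are assumptions made for illustration only.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One environment step (hypothetical container, not from the study)."""
    state: Sequence[float]
    action: Sequence[float]
    reward: float
    next_state: Sequence[float]
    done: bool


class ReplayBuffer:
    """Fixed-size store of past transitions with uniform random sampling."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # A deque with maxlen drops the oldest transition once capacity is reached.
        self._storage: Deque[Transition] = deque(maxlen=capacity)
        # Private RNG so sampling is reproducible for a given seed.
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        self._storage.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Draw without replacement; the batch may mix data from many past policies,
        # which is what makes the learning off-policy.
        if batch_size > len(self._storage):
            raise ValueError("not enough stored transitions for this batch size")
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self) -> int:
        return len(self._storage)


# Usage: fill the buffer past capacity, then draw a small batch.
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(Transition([float(step)], [0.0], 1.0, [float(step + 1)], False))
batch = buffer.sample(2)
```

In DDPG, TD3, and SAC, a batch like this would feed both the value-function (critic) update and the policy (actor) update. The sketch omits those updates because the source does not describe them in detail.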