LG Sep 6, 2021

Error Controlled Actor-Critic

Xingen Gao, Fei Chao, Changle Zhou, Zhen Ge, Chih-Min Lin, Longzhi Yang, Xiang Chang, Changjing Shang

arXiv:2109.02517v21.6 Has Code

Originality: Incremental advance

AI Analysis

This work addresses a key bottleneck in reinforcement learning for continuous control, though it appears incremental because it builds on existing actor-critic frameworks.

The paper tackles approximation error in value functions, which causes overestimation and hinders convergence in actor-critic methods. It proposes Error Controlled Actor-Critic, which confines this error and is reported to significantly outperform other model-free RL algorithms on continuous control tasks.

The approximation error of the value function inevitably causes overestimation and has a negative impact on the convergence of the algorithms. To mitigate the negative effects of the approximation error, we propose Error Controlled Actor-Critic, which confines the approximation error in the value function. We present an analysis of how the approximation error can hinder the optimization process of actor-critic methods. Then, we derive an upper bound on the approximation error of the Q-function approximator and find that the error can be lowered by restricting the KL divergence between every two consecutive policies when training the policy. Experiments on a range of continuous control tasks demonstrate that the proposed actor-critic algorithm visibly reduces the approximation error and significantly outperforms other model-free RL algorithms.
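The idea of restricting the KL divergence between consecutive policies can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes diagonal Gaussian policies, a soft penalty rather than a hard constraint, and the KL direction KL(old || new). The names `gaussian_kl`, `kl_penalized_actor_loss`, and the weight `beta` are hypothetical.

```python
import numpy as np


def gaussian_kl(mu_old, std_old, mu_new, std_new):
    """KL(old || new) between diagonal Gaussians, summed over action dims.

    Assumption: policies are diagonal Gaussians; the direction of the KL
    term is an illustrative choice, not taken from the paper.
    """
    var_old = std_old ** 2
    var_new = std_new ** 2
    per_dim = (
        np.log(std_new / std_old)
        + (var_old + (mu_old - mu_new) ** 2) / (2.0 * var_new)
        - 0.5
    )
    return per_dim.sum(axis=-1)


def kl_penalized_actor_loss(q_values, mu_old, std_old, mu_new, std_new, beta=1.0):
    """Actor loss: maximize Q while penalizing drift from the previous policy.

    `beta` (hypothetical) trades off return against the KL penalty.
    """
    kl = gaussian_kl(mu_old, std_old, mu_new, std_new)
    return float(np.mean(-q_values + beta * kl))


if __name__ == "__main__":
    # Batch of 2 states, 2 action dimensions.
    mu_old = np.zeros((2, 2))
    std = np.ones((2, 2))
    mu_new = np.ones((2, 2))  # new policy mean shifted by 1 in each dim
    q = np.array([1.0, 2.0])
    print(kl_penalized_actor_loss(q, mu_old, std, mu_new, std, beta=2.0))
```

A larger `beta` keeps successive policies closer together, which is the mechanism the abstract credits with lowering the Q-function approximation error; how the paper actually enforces the restriction is not specified here.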

