LG Jan 24, 2022

State-Conditioned Adversarial Subgoal Generation

Vivienne Huiling Wang, Joni Pajarinen, Tinghuai Wang, Joni-Kristian Kämäräinen

arXiv:2201.09635v4 10.417 citations

Originality: Incremental advance

AI Analysis

The paper addresses a key bottleneck in off-policy hierarchical reinforcement learning for robotics and control applications, though it appears incremental because it builds on existing adversarial methods.

The paper tackles the non-stationarity of high-level policies in off-policy hierarchical reinforcement learning. It proposes a state-conditioned adversarial approach that generates subgoals compatible with the current low-level policy, and reports improved learning efficiency and performance on continuous control tasks.

Hierarchical reinforcement learning (HRL) proposes to solve difficult tasks by performing decision-making and control at successively higher levels of temporal abstraction. However, off-policy HRL often suffers from the problem of a non-stationary high-level policy since the low-level policy is constantly changing. In this paper, we propose a novel HRL approach for mitigating the non-stationarity by adversarially enforcing the high-level policy to generate subgoals compatible with the current instantiation of the low-level policy. In practice, the adversarial learning is implemented by training a simple state-conditioned discriminator network concurrently with the high-level policy which determines the compatibility level of subgoals. Comparison to state-of-the-art algorithms shows that our approach improves both learning efficiency and performance in challenging continuous control tasks.
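The mechanism described in the abstract can be illustrated with a deliberately tiny sketch. It is not the authors' implementation. It assumes 1-D states, a hypothetical low-level policy that moves at most 1.0 per high-level step, a high-level policy reduced to a single learnable `offset`, and a hand-picked discriminator feature `(g - s) ** 2`. It also assumes that "compatible" examples are states the low-level policy actually reached, which the abstract does not specify. For readability the sketch runs one discriminator phase followed by one policy phase, whereas the abstract describes the two being trained concurrently.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Discriminator:
    """Toy state-conditioned discriminator D(s, g) for 1-D states (illustrative only)."""

    def __init__(self):
        self.a = 0.0  # bias
        self.b = 0.0  # weight on the squared state-to-subgoal distance

    def score(self, s, g):
        # Conditioning on the state: the same subgoal g scores differently
        # depending on where the agent currently is.
        return sigmoid(self.a + self.b * (g - s) ** 2)


def fit_discriminator(disc, reached, proposed, lr=0.02, steps=300):
    """Full-batch gradient descent on binary cross-entropy.

    reached:  (state, state the low level actually reached) pairs, label 1
    proposed: (state, subgoal from the high-level policy) pairs, label 0
    """
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for batch, label in ((reached, 1.0), (proposed, 0.0)):
            for s, g in batch:
                err = disc.score(s, g) - label  # dLoss/dlogit
                grad_a += err / len(batch)
                grad_b += err * (g - s) ** 2 / len(batch)
        disc.a -= lr * grad_a
        disc.b -= lr * grad_b


def improve_offset(disc, states, offset, lr=0.05, steps=200):
    """High-level policy g = s + offset; descend -log D(s, g) w.r.t. offset."""
    for _ in range(steps):
        grad = 0.0
        for s in states:
            d = disc.score(s, s + offset)
            # d(-log D)/d(offset) = -(1 - D) * d(logit)/d(offset)
            grad += -(1.0 - d) * disc.b * 2.0 * offset / len(states)
        offset -= lr * grad
    return offset


states = [0.0, 1.0, 2.0, 3.0]
low_level_moves = [-1.0, -0.5, 0.0, 0.5, 1.0]  # assumed reach of the low level
reached = [(s, s + d) for s in states for d in low_level_moves]

offset = 3.0  # initial high-level proposal, far beyond the assumed reach
proposed = [(s, s + offset) for s in states]

disc = Discriminator()
fit_discriminator(disc, reached, proposed)
new_offset = improve_offset(disc, states, offset)

print("score of a nearby subgoal:", disc.score(0.0, 0.5))
print("score of a distant subgoal:", disc.score(0.0, 3.0))
print("offset before / after:", offset, new_offset)
```

In this toy setting the discriminator learns to rate subgoals far from the current state as incompatible, and the policy update pulls the proposed offset back toward what the low level can reach. A realistic version would replace both the linear discriminator and the scalar offset with neural networks and combine the adversarial term with the high-level policy's reinforcement learning objective.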
