LG AI RO SY

Feb 16, 2021

Steadily Learn to Drive with Virtual Memory

Yuhang Zhang, Yao Mu, Yujie Yang, Yang Guan, Shengbo Eben Li, Qi Sun, Jianyu Chen

arXiv:2102.08072v13.12 citations

Originality: Incremental advance

AI Analysis

This work addresses data inefficiency and instability in RL for autonomous driving, offering a domain-specific improvement that is incremental in nature.

The paper tackles low data efficiency and training oscillation in reinforcement learning for high-dimensional autonomous driving tasks. It proposes the LVM algorithm, which compresses high-dimensional information into latent states and uses a latent dynamic model to generate imagined trajectories (virtual memory) for policy learning. In an image-input driving task, LVM showed better data efficiency, stability, and control performance than the existing method it was compared against.

Reinforcement learning has shown great potential in developing high-level autonomous driving. However, for high-dimensional tasks, current RL methods suffer from low data efficiency and oscillation in the training process. This paper proposes an algorithm called Learn to drive with Virtual Memory (LVM) to overcome these problems. LVM compresses the high-dimensional information into compact latent states and learns a latent dynamic model to summarize the agent's experience. The latent dynamic model generates various imagined latent trajectories, which serve as virtual memory. The policy is learned by propagating gradients through the learned latent model along the imagined latent trajectories, which leads to high data efficiency. Furthermore, a double critic structure is designed to reduce oscillation during the training process. The effectiveness of LVM is demonstrated on an image-input autonomous driving task, in which LVM outperforms the existing method in terms of data efficiency, learning stability, and control performance.
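
The abstract's core loop (imagine latent trajectories with a learned model, then push gradients back through that model into the policy) can be illustrated with a toy sketch. This is not the LVM implementation: the scalar linear latent model, the linear policy `u = -k * z`, the quadratic reward, and all names and constants below are assumptions chosen so that the gradient can be written out by hand. The `double_critic_target` helper takes the smaller of two value estimates, as in clipped double Q-learning; the abstract does not say how LVM's double critic is built, so that part is a stand-in as well.

```python
import random

def imagine_return(k, z0, a=0.9, b=0.5, horizon=15, gamma=0.95):
    """Imagine one latent trajectory and differentiate its return w.r.t. k.

    Assumed toy setup (not from the paper):
      latent model  z' = a*z + b*u
      policy        u  = -k*z
      reward        -(z^2 + 0.1*u^2)
    Returns (discounted return, d(return)/dk), with the derivative
    carried forward through the model step by step.
    """
    z, dz = z0, 0.0          # dz holds dz/dk
    ret, dret = 0.0, 0.0
    disc = 1.0
    for _ in range(horizon):
        u = -k * z
        du = -z - k * dz     # product rule on u = -k*z
        ret += disc * -(z * z + 0.1 * u * u)
        dret += disc * -(2.0 * z * dz + 0.2 * u * du)
        # The gradient flows through the latent model, not the real environment.
        z, dz = a * z + b * u, a * dz + b * du
        disc *= gamma
    return ret, dret

def train_policy(steps=200, lr=0.1, batch=16, seed=0):
    """Gradient ascent on the imagined return, averaged over random start states."""
    rng = random.Random(seed)
    k = 0.0
    for _ in range(steps):
        grad = 0.0
        for _ in range(batch):
            z0 = rng.uniform(-1.0, 1.0)   # imagined start state ("virtual memory")
            grad += imagine_return(k, z0)[1] / batch
        k += lr * grad
    return k

def double_critic_target(reward, v1_next, v2_next, gamma=0.95):
    """Bootstrapped target using the smaller of two critics' estimates (assumed form)."""
    return reward + gamma * min(v1_next, v2_next)

if __name__ == "__main__":
    k = train_policy()
    print("learned gain:", k)
    print("return before:", imagine_return(0.0, 1.0)[0])
    print("return after: ", imagine_return(k, 1.0)[0])
```

No real environment step appears in the training loop: every update uses rollouts from the model alone, which is the source of the data-efficiency claim. In the paper's setting the latent states would come from an image encoder and the model would be learned from driving experience, and neither is modeled here.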
