An Actor-Critic-Attention Mechanism for Deep Reinforcement Learning in Multi-view Environments
This work addresses the challenge of learning policies in complex, partially observable multi-view settings for applications like autonomous driving, although the contribution is incremental, since it builds on existing attention and reinforcement learning methods.
The paper tackles partial observability in multi-view reinforcement learning environments by proposing an attention mechanism that dynamically weights the views to produce a single feature representation. The method is reported to outperform state-of-the-art baselines on TORCS and three other complex 3D environments.
In reinforcement learning, leveraging multiple views of the environment can improve the learning of complicated policies. Because the views in a multi-view environment frequently suffer from partial observability, their levels of importance often differ. In this paper, we propose a deep reinforcement learning method with an attention mechanism for multi-view environments. Each view can provide different representative information about the environment. Through its attention mechanism, our method generates a single feature representation of the environment from its multiple views. It learns a policy that dynamically attends to each view according to its importance in the decision-making process. Through experiments, we show that our method outperforms state-of-the-art baselines on the TORCS racing car simulator and three other complex 3D environments with obstacles. We also provide experimental results that evaluate the performance of our method under noisy conditions and in partial observation settings.
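The core idea of weighting views by importance and fusing them into a single feature representation can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `softmax` and `fuse_views` are hypothetical, the softmax normalization is an assumption, and the per-view importance scores are assumed to come from some learned component that is not modeled here.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(
    view_features: Sequence[Sequence[float]],
    importance_scores: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Fuse per-view feature vectors into one vector.

    Assumption (not from the paper): each view has already been encoded
    into a feature vector of the same length, and a scalar importance
    score per view is supplied by a learned module.
    Returns the attention weights and the weighted sum of the features.
    """
    if len(view_features) != len(importance_scores):
        raise ValueError("need exactly one score per view")
    weights = softmax(importance_scores)
    dim = len(view_features[0])
    fused = [0.0] * dim
    for w, feat in zip(weights, view_features):
        for i in range(dim):
            fused[i] += w * feat[i]
    return weights, fused


# Hypothetical example: three camera views, each encoded as a 2-d feature.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
scores = [2.0, 0.1, -1.0]  # e.g. the third view is occluded or noisy
weights, fused = fuse_views(features, scores)
```

Under these assumptions, a view with a low score (for instance, one degraded by noise or occlusion) contributes little to the fused vector, which is the behavior the abstract describes as attending to views according to their importance.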