Reinforcement Learning with A* and a Deep Heuristic
This work addresses a key limitation of A*, namely that it requires a known heuristic, and so enables its use in scenarios such as pixel-based driving simulations. The contribution is incremental, however, as it builds on existing methods that combine neural networks and trees.
The paper tackles the problem of applying A* to domains lacking a good heuristic by training a deep neural network as the heuristic and combining it with A*, resulting in significantly better performance than N-Step Deep Q-Learning in a driving simulation with pixel-based input.
A* is a popular path-finding algorithm, but it can only be applied to domains where a good heuristic function is known. Inspired by recent methods combining Deep Neural Networks (DNNs) and trees, this study demonstrates how to train a heuristic represented by a DNN and combine it with A*. This new algorithm, which we call aleph-star, can be used efficiently in domains where the input to the heuristic can be processed by a neural network. We compare aleph-star to N-Step Deep Q-Learning (DQN; Mnih et al., 2013) in a driving simulation with pixel-based input, and demonstrate significantly better performance in this scenario.
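To make the role of the heuristic concrete, the following is a minimal sketch of textbook A* in which the heuristic is an arbitrary callable. It is not the aleph-star algorithm itself, and it does not show how the heuristic is trained. The names `a_star`, `grid_successors`, and `learned_heuristic` are hypothetical, and `learned_heuristic` is a hand-written Manhattan distance standing in for the output of a trained network; the small grid world is likewise an assumption made purely for illustration.

```python
import heapq

def a_star(start, is_goal, successors, heuristic):
    """Textbook A*: expand states in order of f = g + h.

    `heuristic` is any callable mapping a state to an estimated
    cost-to-go; a trained network's forward pass could be used here.
    """
    counter = 0  # tie-breaker so the heap never compares states directly
    frontier = [(heuristic(start), counter, start)]
    g_cost = {start: 0.0}   # best known cost from start to each state
    parent = {start: None}  # back-pointers for path reconstruction
    while frontier:
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt, step_cost in successors(state):
            new_g = g_cost[state] + step_cost
            if new_g < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_g
                parent[nxt] = state
                counter += 1
                heapq.heappush(frontier, (new_g + heuristic(nxt), counter, nxt))
    return None  # goal unreachable

# Illustrative 5x5 grid world (an assumption, not the paper's driving simulation).
WALLS = {(1, 0), (1, 1), (1, 2), (1, 3)}
GOAL = (4, 0)

def grid_successors(state):
    """Yield (neighbour, step_cost) pairs for 4-connected moves."""
    x, y = state
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < 5 and 0 <= ny < 5 and (nx, ny) not in WALLS:
            yield (nx, ny), 1.0

def learned_heuristic(state):
    """Stand-in for a DNN heuristic: Manhattan distance to GOAL."""
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])

path = a_star((0, 0), lambda s: s == GOAL, grid_successors, learned_heuristic)
```

Because the search only ever calls `heuristic(state)`, replacing the stand-in with a learned model leaves the search loop unchanged. Note that A* is guaranteed to return an optimal path only when the heuristic is admissible, a property a learned heuristic does not automatically have.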