Gated Path Planning Networks
This work addresses training difficulties in differentiable path planning modules used by navigation agents, offering an incremental improvement over Value Iteration Networks.
The paper tackles the optimization problems of Value Iteration Networks (VINs) in path planning, such as training instability and random seed sensitivity. It reframes VINs as recurrent-convolutional networks and replaces their update with gated recurrent updates. The resulting architecture outperforms VIN on metrics such as learning speed and generalization across a variety of environments.
Value Iteration Networks (VINs) are effective differentiable path planning modules that can be used by agents to perform navigation while still maintaining end-to-end differentiability of the entire architecture. Despite their effectiveness, they suffer from several disadvantages, including training instability, random seed sensitivity, and other optimization problems. In this work, we reframe VINs as recurrent-convolutional networks, which shows that VINs couple recurrent convolutions with an unconventional max-pooling activation. From this perspective, we argue that standard gated recurrent update equations could potentially alleviate the optimization issues plaguing VIN. The resulting architecture, which we call the Gated Path Planning Network, is shown to empirically outperform VIN on a variety of metrics such as learning speed, hyperparameter sensitivity, iteration count, and even generalization. Furthermore, we show that this performance gap is consistent across different maze transition types and maze sizes, and we demonstrate success on a challenging 3D environment, where the planner is only provided with first-person RGB images.
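To make the reframing concrete, the following is a minimal NumPy sketch, not the authors' implementation. It contrasts one VIN-style update (a convolution followed by a max over the action channel) with one LSTM-style gated update applied at every grid cell. All function names, tensor shapes, and the exact gate wiring (`conv2d_same`, `vin_step`, `gated_step`, `w_gates`) are illustrative assumptions. The abstract only states that standard gated recurrent update equations replace the max-pooling update.

```python
import numpy as np

def conv2d_same(x, w):
    """Zero-padded 2D cross-correlation.
    x: (C_in, H, W), w: (C_out, C_in, k, k) with odd k -> (C_out, H, W)."""
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    _, H, W = x.shape
    out = np.zeros((c_out, H, W))
    for i in range(k):
        for j in range(k):
            patch = xp[:, i:i + H, j:j + W]  # shifted view of the input
            out += np.einsum('oc,chw->ohw', w[:, :, i, j], patch)
    return out

def vin_step(v, r, w):
    """One VIN-style iteration (assumed shapes).
    v, r: (1, H, W) value and reward maps; w: (A, 2, k, k), one filter per action.
    The recurrent 'activation' is a max over the action channel."""
    q = conv2d_same(np.concatenate([r, v], axis=0), w)  # (A, H, W)
    return q.max(axis=0, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_step(h, c, w_conv, w_gates, b):
    """One LSTM-style gated iteration (a simplified, hypothetical wiring).
    h, c: (D, H, W) hidden and cell maps; w_conv: (D, D, k, k);
    w_gates: (4D, 2D) shared across positions; b: (4D,)."""
    x = conv2d_same(h, w_conv)  # aggregate neighbouring hidden states
    z = np.einsum('gd,dhw->ghw', w_gates, np.concatenate([x, h], axis=0))
    z = z + b[:, None, None]
    i, f, o, g = np.split(z, 4, axis=0)  # input, forget, output gates, candidate
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new
```

Both functions are recurrences over a grid-shaped state; they differ only in how the convolved neighbourhood information is folded back into that state. Running either for K iterations means calling the step function K times on its own output.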