LG · AI · May 11, 2023

Value Iteration Networks with Gated Summarization Module

arXiv:2305.07039v2 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses planning efficiency and training stability in reinforcement learning. It is relevant to researchers and practitioners and represents an incremental improvement over existing VIN methods.

The paper tackles the challenges of Value Iteration Networks (VIN) with larger input maps and error accumulation by proposing GS-VIN, which uses an adaptive iteration strategy and gated summarization module, achieving improved accuracy, success rates, and performance in grid world and Atari experiments.

In this paper, we address the challenges faced by Value Iteration Networks (VIN) in handling larger input maps and mitigating the impact of accumulated errors caused by increased iterations. We propose a novel approach, Value Iteration Networks with Gated Summarization Module (GS-VIN), which incorporates two main improvements: (1) employing an Adaptive Iteration Strategy in the Value Iteration (VI) module to reduce the number of iterations, and (2) introducing a Gated Summarization module to summarize the iterative process. The adaptive iteration strategy uses larger convolution kernels with fewer iterations, reducing network depth and increasing training stability while maintaining the accuracy of the planning process. The gated summarization module enables the network to draw on the entire planning process, rather than relying solely on the final global planning outcome, by temporally and spatially resampling the entire planning process within the VI module. We conduct experiments on 2D grid world path-finding problems and the Atari Mr. Pac-man environment, demonstrating that GS-VIN outperforms the baseline in terms of single-step accuracy, planning success rate, and overall performance across different map sizes. Additionally, we provide an analysis of the relationship between input size, kernel size, and the number of iterations in VI-based models, which applies to most VI-based models and offers valuable insights for researchers and industrial deployment.
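The trade-off between kernel size and iteration count can be illustrated with a small sketch. This is not the paper's analysis or code; it only assumes that one VI iteration with an odd k×k kernel (stride 1) lets value information travel at most (k-1)/2 cells along each axis, and that the map is obstacle-free, so the result is a lower bound. The function names `min_iterations` and `count_iterations_to_cover`, and the 28×28 map size, are hypothetical choices for illustration.

```python
import math

import numpy as np


def min_iterations(map_size: int, kernel_size: int) -> int:
    """Lower bound on iterations for information to cross the map.

    Assumption (not from the paper): each iteration spreads information
    by radius = kernel_size // 2 cells per axis on an obstacle-free map.
    """
    if kernel_size < 3 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd and >= 3")
    radius = kernel_size // 2
    return math.ceil((map_size - 1) / radius)


def count_iterations_to_cover(map_size: int, kernel_size: int) -> int:
    """Empirical check: grow a 'reached' mask from one corner.

    Each step ORs together all shifts of the mask within the kernel
    window, which mimics how far a k x k convolution can move
    information in a single iteration.
    """
    radius = kernel_size // 2
    reached = np.zeros((map_size, map_size), dtype=bool)
    reached[0, 0] = True  # information starts in the top-left corner
    iterations = 0
    while not reached.all():
        padded = np.pad(reached, radius, mode="constant", constant_values=False)
        grown = np.zeros_like(reached)
        for dy in range(kernel_size):
            for dx in range(kernel_size):
                grown |= padded[dy:dy + map_size, dx:dx + map_size]
        reached = grown
        iterations += 1
    return iterations


if __name__ == "__main__":
    for k in (3, 5, 9):
        print(k, min_iterations(28, k), count_iterations_to_cover(28, k))
```

Under these assumptions, a 28×28 map needs at least 27 iterations with a 3×3 kernel but only 7 with a 9×9 kernel. Maps with obstacles require longer paths, so real iteration counts would be higher than this bound.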

