AI | Nov 11, 2025

Neural Value Iteration

arXiv:2511.08825v1 | h-index: 11
Originality: Incremental advance
AI Analysis

This paper addresses scalability issues in POMDP planning for robotics and AI applications, though it builds incrementally on point-based value iteration.

The authors tackled the computational intractability of solving large-scale POMDPs by representing the value function as neural networks instead of α-vectors, achieving near-optimal solutions in problems where existing solvers fail.

The value function of a POMDP exhibits the piecewise-linear-convex (PWLC) property and can be represented as a finite set of hyperplanes, known as $α$-vectors. Most state-of-the-art POMDP solvers (offline planners) follow the point-based value iteration scheme, which performs Bellman backups on $α$-vectors at reachable belief points until convergence. However, since each $α$-vector is $|S|$-dimensional, these methods quickly become intractable for large-scale problems due to the prohibitive computational cost of Bellman backups. In this work, we demonstrate that the PWLC property allows a POMDP's value function to be alternatively represented as a finite set of neural networks. This insight enables a novel POMDP planning algorithm called Neural Value Iteration, which combines the generalization capability of neural networks with the classical value iteration framework. Our approach achieves near-optimal solutions even in extremely large POMDPs that are intractable for existing offline solvers.
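
To make the classical scheme in the abstract concrete, here is a minimal sketch of the PWLC value function and a single point-based Bellman backup over explicit $α$-vectors. It illustrates the textbook baseline the abstract refers to, not the paper's Neural Value Iteration. The function names, the array layouts `T[a, s, s']`, `O[a, s', o]`, `R[a, s]`, and the small Tiger-style problem are all assumptions chosen for illustration.

```python
import numpy as np

def value(belief, alphas):
    """PWLC value: V(b) = max over alpha-vectors of <alpha, b>."""
    return float(np.max(alphas @ belief))

def point_backup(belief, alphas, T, O, R, gamma):
    """One point-based Bellman backup at `belief`.

    Assumed layouts (illustrative, not from the paper):
      T[a, s, s2] = P(s2 | s, a), O[a, s2, o] = P(o | s2, a), R[a, s] = reward.
    Returns the new alpha-vector (length |S|) and its greedy action.
    """
    n_actions, n_obs = T.shape[0], O.shape[2]
    best_alpha, best_action, best_val = None, None, -np.inf
    for a in range(n_actions):
        alpha_a = R[a].astype(float).copy()
        for o in range(n_obs):
            # g[i, s] = sum_{s2} T[a, s, s2] * O[a, s2, o] * alphas[i, s2]
            g = (alphas * O[a, :, o]) @ T[a].T
            # Keep the projected vector that is best at this belief.
            alpha_a += gamma * g[np.argmax(g @ belief)]
        val = float(alpha_a @ belief)
        if val > best_val:
            best_alpha, best_action, best_val = alpha_a, a, val
    return best_alpha, best_action

# Hypothetical two-state Tiger-style problem.
# States: tiger-left, tiger-right. Actions: listen, open-left, open-right.
T = np.array([[[1.0, 0.0], [0.0, 1.0]],      # listen: state unchanged
              [[0.5, 0.5], [0.5, 0.5]],      # open-left: problem resets
              [[0.5, 0.5], [0.5, 0.5]]])     # open-right: problem resets
O = np.array([[[0.85, 0.15], [0.15, 0.85]],  # listen: 85% accurate
              [[0.5, 0.5], [0.5, 0.5]],      # opening a door is uninformative
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[-1.0, -1.0],                  # listen
              [-100.0, 10.0],                # open-left
              [10.0, -100.0]])               # open-right
gamma = 0.95

# Initialise with one lower-bound vector: worst reward forever.
alphas = np.full((1, 2), R.min() / (1.0 - gamma))

uniform = np.array([0.5, 0.5])
new_alpha, action = point_backup(uniform, alphas, T, O, R, gamma)
alphas = np.vstack([alphas, new_alpha])
```

Every backup here touches arrays of length $|S|$ for each action, observation, and existing vector, which is the cost the abstract identifies as prohibitive in large problems. How the paper's networks stand in for these vectors is not described in the abstract, so the sketch stays with explicit arrays.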

