Distribution over Beliefs for Memory Bounded Dec-POMDP Planning
This work addresses planning in decentralized partially observable Markov decision processes (Dec-POMDPs), a problem specific to multi-agent systems. The contribution appears incremental: it builds on existing point-based methods and adds a novel heuristic.
The paper tackles approximate planning in Dec-POMDPs with a point-based method that uses a heuristic estimate of belief probabilities to select a bounded number of policy trees. The selection is formulated as a combinatorial optimization problem that minimizes the error induced by pruning, and the authors report that the method outperforms state-of-the-art approaches in solution quality.
We propose a new point-based method for approximate planning in Dec-POMDPs which outperforms state-of-the-art approaches in terms of solution quality. It uses a heuristic estimation of the prior probability of beliefs to choose a bounded number of policy trees: this choice is formulated as a combinatorial optimisation problem minimising the error induced by pruning.
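The selection step described above can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a simplified setting with a single set of candidate policy trees, a table `values[t][b]` giving the value of tree `t` at sampled belief `b`, and a vector `belief_probs` of estimated belief probabilities. The names `pruning_error` and `select_trees` and the toy numbers are hypothetical, and the optimisation is solved by exhaustive search, which is only practical for very small instances.

```python
from itertools import combinations


def pruning_error(selected, values, belief_probs):
    """Expected value lost when only the trees in `selected` are kept.

    For each belief, the loss is the gap between the best value over all
    candidate trees and the best value over the kept trees, weighted by
    the (assumed) estimated probability of reaching that belief.
    """
    error = 0.0
    for b, prob in enumerate(belief_probs):
        best_all = max(tree_values[b] for tree_values in values)
        best_kept = max(values[t][b] for t in selected)
        error += prob * (best_all - best_kept)
    return error


def select_trees(values, belief_probs, max_trees):
    """Pick at most `max_trees` trees minimising the expected pruning error.

    Keeping more trees never increases the error, so it is enough to
    enumerate subsets of size exactly min(max_trees, number of trees).
    """
    n_trees = len(values)
    k = min(max_trees, n_trees)
    best = min(
        combinations(range(n_trees), k),
        key=lambda subset: pruning_error(subset, values, belief_probs),
    )
    return best, pruning_error(best, values, belief_probs)


# Toy instance (illustrative numbers only): 4 candidate trees, 3 beliefs.
values = [
    [10.0, 2.0, 1.0],  # tree 0: strong at belief 0
    [3.0, 9.0, 2.0],   # tree 1: strong at belief 1
    [1.0, 3.0, 8.0],   # tree 2: strong at belief 2
    [6.0, 6.0, 6.0],   # tree 3: mediocre everywhere
]
belief_probs = [0.5, 0.3, 0.2]

kept, err = select_trees(values, belief_probs, max_trees=2)
print(kept, err)
```

In this toy instance the search keeps the trees that are best at the two most probable beliefs and sacrifices the least probable one, which is the intuition behind weighting the pruning error by belief probability. A real Dec-POMDP planner would evaluate joint policies across agents and would need a scalable solver in place of the enumeration.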