Real-time tree search with pessimistic scenarios
This paper addresses real-time decision-making for autonomous agents in critical situations such as collision avoidance. The contribution is incremental in that it builds on existing tree search methods.
The paper tackles real-time decision-making for autonomous agents in partially observable, multi-agent environments. It proposes a tree search technique that switches to a single deterministic, pessimistic scenario beyond a specified depth, which lets the search consider events far in the future. The technique was demonstrated in the Pommerman competition, where agents using it took first and third places.
Autonomous agents need to make decisions sequentially, in a partially observable environment, while taking into account how other agents behave. In critical situations, such decisions need to be made in real time, for example to avoid collisions and recover to safe conditions. We propose a tree search technique in which a deterministic and pessimistic scenario is used after a specified depth. Because the deterministic scenario involves no branching, the proposed technique allows us to take into account events that can occur far ahead in the future. The effectiveness of the proposed technique is demonstrated in Pommerman, a multi-agent environment used in a NeurIPS 2018 competition, where the agents that implement the proposed technique won first and third places.
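The idea of branching only up to a specified depth and then following one deterministic, pessimistic scenario can be illustrated with a toy sketch. This is not the authors' implementation and does not model Pommerman. It assumes a hypothetical one-dimensional corridor with a single opponent, and every name in it (`outcomes`, `pessimistic_step`, `leaf_value`, `search`, `N_CELLS`) is invented for illustration. Near the root the opponent is assumed to move uniformly at random. Beyond the depth limit the opponent is assumed to always close in while the agent stays put, and the leaf is scored by how many steps the agent survives.

```python
from typing import List, Tuple

State = Tuple[int, int]  # (agent cell, opponent cell) in a toy corridor
N_CELLS = 7              # hypothetical corridor length


def clip(x: int) -> int:
    """Keep a cell index inside the corridor."""
    return max(0, min(N_CELLS - 1, x))


def caught(state: State) -> bool:
    agent, opp = state
    return agent == opp


def outcomes(state: State, action: int) -> List[Tuple[float, State]]:
    """Branching model used near the root (assumption): the opponent
    moves left, stays, or moves right with equal probability."""
    agent, opp = state
    new_agent = clip(agent + action)
    return [(1 / 3, (new_agent, clip(opp + d))) for d in (-1, 0, 1)]


def pessimistic_step(state: State) -> State:
    """Single worst-case successor (assumption): the opponent steps
    toward the agent while the agent stays where it is."""
    agent, opp = state
    if opp < agent:
        opp += 1
    elif opp > agent:
        opp -= 1
    return agent, opp


def leaf_value(state: State, horizon: int) -> float:
    """Steps survived along the one deterministic path (no branching)."""
    survived = 0
    for _ in range(horizon):
        if caught(state):
            break
        survived += 1
        state = pessimistic_step(state)
    return float(survived)


def search(state: State, depth: int, horizon: int) -> Tuple[float, int]:
    """Return (value, best action): expectimax over `depth` branching
    levels, then a pessimistic rollout of up to `horizon` steps."""
    if caught(state):
        return 0.0, 0
    if depth == 0:
        return leaf_value(state, horizon), 0
    best_value, best_action = float("-inf"), 0
    for action in (-1, 0, 1):
        value = sum(p * search(nxt, depth - 1, horizon)[0]
                    for p, nxt in outcomes(state, action))
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


# Agent at cell 2, opponent at cell 0: the search picks +1 (move away).
value, action = search((2, 0), depth=1, horizon=10)
print(action, round(value, 3))
```

In this sketch the branching part costs on the order of 9^depth node expansions (three agent actions times three opponent moves per level), whereas each leaf rollout costs only `horizon` steps. That asymmetry is why a long horizon stays affordable once branching stops.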