Team Behavior in Interactive Dynamic Influence Diagrams with Applications to Ad Hoc Teams
This work addresses effective cooperation in multiagent systems, with applications such as robotics and gaming. Its contribution appears incremental, since it builds on the existing interactive dynamic influence diagram framework.
The paper tackles the challenge of planning for ad hoc teamwork, where agents collaborate without prior coordination, by addressing limitations in finitely-nested modeling that prevent optimal solutions. It demonstrates that integrating learning into planning within interactive dynamic influence diagrams facilitates optimal team behavior.
Planning for ad hoc teamwork is challenging because it involves agents collaborating without any prior coordination or communication. The focus is on principled methods for a single agent to cooperate with others. This motivates investigating the ad hoc teamwork problem in the context of individual decision making frameworks. However, individual decision making in multiagent settings requires reasoning about other agents' actions, which in turn requires reasoning about how those agents reason about others, and so on without end. An established approximation that operationalizes this approach is to bound the infinite nesting from below by introducing level 0 models. We show that a consequence of finitely-nested modeling is that we may not obtain optimal team solutions in cooperative settings. We address this limitation by including models at level 0 whose solutions involve learning. We demonstrate that integrating learning into planning in the context of interactive dynamic influence diagrams facilitates optimal team behavior, and is applicable to ad hoc teamwork.
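The claim that finite nesting can miss the optimal team solution may be easier to see in a toy example. The sketch below is not the paper's algorithm and does not use influence diagrams: it is a one-shot, two-action cooperative game with an invented shared payoff matrix `R`. The names `level0_fixed`, `level0_learning`, and `best_response_i` are hypothetical. It also assumes that the level 1 agent i predicts j's action exactly, and it uses a simple running-average value update as a stand-in for whatever learning method the level 0 model employs.

```python
# Shared team payoff R[a_i][a_j]; the numbers are invented for illustration.
R = [[10, 0],
     [-30, 5]]
ACTIONS = [0, 1]


def best_response_i(a_j):
    """Level 1 agent i: best action against the predicted action of j."""
    return max(ACTIONS, key=lambda a_i: R[a_i][a_j])


def level0_fixed():
    """Conventional level 0 model of j: treats i as uniform noise and
    picks the action with the highest average payoff."""
    return max(ACTIONS,
               key=lambda a_j: sum(R[a_i][a_j] for a_i in ACTIONS) / len(ACTIONS))


def level0_learning(episodes=50, alpha=0.5):
    """Level 0 model of j that learns (assumed toy scheme): j tries each
    action in turn and updates a value estimate from the team reward
    received while i responds to that action."""
    q = [0.0 for _ in ACTIONS]
    for t in range(episodes):
        a_j = ACTIONS[t % len(ACTIONS)]          # cycle through j's actions
        a_i = best_response_i(a_j)               # teammate's response
        q[a_j] += alpha * (R[a_i][a_j] - q[a_j])  # running-average update
    return max(ACTIONS, key=lambda a: q[a])


def team_outcome(a_j):
    """Joint action and team payoff when i best-responds to a_j."""
    a_i = best_response_i(a_j)
    return (a_i, a_j), R[a_i][a_j]


fixed_joint, fixed_value = team_outcome(level0_fixed())
learned_joint, learned_value = team_outcome(level0_learning())
print("fixed level 0:   ", fixed_joint, fixed_value)
print("learning level 0:", learned_joint, learned_value)
```

With these payoffs, the fixed level 0 model steers the team to the joint action (1, 1) with payoff 5, because averaging over a "noisy" teammate makes action 0 look risky. The learning level 0 model instead values each action by the reward the team actually obtains, and the team reaches (0, 0) with the maximum payoff of 10.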