Individual Planning in Agent Populations: Exploiting Anonymity and Frame-Action Hypergraphs
This work addresses scalability issues for researchers and practitioners in multiagent systems, though the contribution is incremental because it builds on the existing I-POMDP framework.
The paper tackles the scalability problem of interactive partially observable Markov decision processes (I-POMDPs) in multiagent settings by modeling and extending anonymity and context-specific independence, and it demonstrates the resulting computational efficiency by solving a problem with more than 1,000 agents.
Interactive partially observable Markov decision processes (I-POMDPs) provide a formal framework for planning by a self-interested agent in multiagent settings. An agent operating in a multiagent environment must deliberate about the actions that other agents may take and the effect these actions have on the environment and on the rewards it receives. Traditional I-POMDPs model this dependence on the actions of other agents using joint action and model spaces. Consequently, the solution complexity grows exponentially with the number of agents, which limits scalability. In this paper, we model and extend anonymity and context-specific independence -- problem structures often present in agent populations -- for computational gain. We empirically demonstrate the efficiency gained from exploiting these problem structures by solving a new multiagent problem involving more than 1,000 agents.
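
To give a rough sense of why anonymity helps, the sketch below compares the size of a joint action space with the number of action-count vectors for the same population. It is an illustration under simplifying assumptions, not the paper's algorithm: every other agent is assumed to choose from the same action set, and anonymity is taken to mean that only how many agents perform each action matters, not which agents do so. The function names are hypothetical.

```python
from math import comb


def joint_action_count(num_agents: int, num_actions: int) -> int:
    """Number of joint actions when each agent's identity matters."""
    return num_actions ** num_agents


def configuration_count(num_agents: int, num_actions: int) -> int:
    """Number of action-count vectors when agents are interchangeable.

    This counts the ways to split num_agents indistinguishable agents
    among num_actions actions (a stars-and-bars count).
    """
    return comb(num_agents + num_actions - 1, num_actions - 1)


if __name__ == "__main__":
    # Assumed example: 3 actions per agent, growing population sizes.
    for n in (2, 10, 1000):
        joint = joint_action_count(n, 3)
        configs = configuration_count(n, 3)
        # Print the digit count of the joint space, since it gets very large.
        print(f"agents={n}: joint actions have {len(str(joint))} digits, "
              f"count vectors={configs}")
```

With a fixed number of actions, the count-vector total grows polynomially in the number of agents, whereas the joint action space grows exponentially.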