AI
Feb 18, 2014

Off-Policy General Value Functions to Represent Dynamic Role Assignments in RoboCup 3D Soccer Simulation

Saminda Abeyruwan, Andreas Seekircher, Ubbo Visser

arXiv:1402.4525v1
2 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses efficient multi-agent coordination in dynamic, adversarial environments, a problem relevant to robotics and simulation researchers. The contribution is incremental: it applies existing reinforcement learning methods to a specific domain.

The paper tackles the challenge of learning accurate world knowledge in real time for dynamic role assignments in the RoboCup 3D Soccer Simulation. Using Off-Policy Gradient Descent algorithms, the agents learn policies that are competitive against the top teams from the RoboCup 2012 competitions in three vs three, five vs five, and seven vs seven configurations.

Collecting and maintaining accurate world knowledge in a dynamic, complex, adversarial, and stochastic environment such as the RoboCup 3D Soccer Simulation is a challenging task. Knowledge should be learned in real time under time constraints. We use recently introduced Off-Policy Gradient Descent algorithms within Reinforcement Learning that illustrate learnable knowledge representations for dynamic role assignments. The results show that the agents have learned competitive policies against the top teams from the RoboCup 2012 competitions for three vs three, five vs five, and seven vs seven agents. We have explicitly used subsets of agents to identify the dynamics and the semantics for which the agents learn to maximize their performance measures, and to gather knowledge about different objectives, so that all agents participate effectively and efficiently within the group.
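To make the phrase "Off-Policy Gradient Descent algorithms" more concrete, here is a minimal sketch of one member of the gradient temporal-difference family, GTD(λ) with linear function approximation, learning a single general value function. The abstract does not say which algorithms, features, or parameters the authors used, so the class name `GTDLambda`, the step sizes, the discount, and the one-feature toy problem are all assumptions made for illustration only.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


class GTDLambda:
    """Illustrative GTD(lambda) learner for one general value function.

    Hypothetical sketch: not the authors' implementation. All
    hyperparameter defaults are assumed values.
    """

    def __init__(self, n, alpha=0.1, beta=0.01, gamma=0.9, lam=0.0):
        self.alpha = alpha      # step size for primary weights theta
        self.beta = beta        # step size for secondary weights w
        self.gamma = gamma      # discount (continuation) factor
        self.lam = lam          # eligibility-trace decay
        self.theta = [0.0] * n  # primary weights: the prediction
        self.w = [0.0] * n      # secondary weights: gradient correction
        self.e = [0.0] * n      # eligibility trace

    def predict(self, phi):
        return dot(self.theta, phi)

    def update(self, phi, cumulant, phi_next, rho):
        """One step. rho is the importance-sampling ratio pi(a|s) / b(a|s)."""
        # TD error for the cumulant (the pseudo-reward being predicted).
        delta = (cumulant + self.gamma * dot(self.theta, phi_next)
                 - dot(self.theta, phi))
        # Trace update, weighted by rho to correct for off-policy data.
        self.e = [rho * (p + self.gamma * self.lam * ei)
                  for p, ei in zip(phi, self.e)]
        w_e = dot(self.w, self.e)
        w_phi = dot(self.w, phi)
        correction = self.gamma * (1.0 - self.lam) * w_e
        # Primary update: TD step minus the gradient-correction term.
        self.theta = [t + self.alpha * (delta * ei - correction * pn)
                      for t, ei, pn in zip(self.theta, self.e, phi_next)]
        # Secondary update: track the expected TD error per feature.
        self.w = [wi + self.beta * (delta * ei - w_phi * p)
                  for wi, ei, p in zip(self.w, self.e, phi)]
        return delta


# Toy usage: one state with one feature and a constant cumulant of 1.
# With gamma = 0.5 the true value is 1 / (1 - 0.5) = 2.
learner = GTDLambda(n=1, gamma=0.5)
for _ in range(5000):
    learner.update(phi=[1.0], cumulant=1.0, phi_next=[1.0], rho=1.0)
print(learner.predict([1.0]))
```

In a role-assignment setting, one could imagine running many such learners in parallel, each with its own cumulant and target policy, all fed by the same stream of behavior data. That arrangement is a reading of the abstract, not a description of the paper's architecture.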
