Application of LLM Guided Reinforcement Learning in Formation Control with Collision Avoidance
This work addresses the reward-design bottleneck in multi-agent systems for robotics and autonomous vehicle applications, representing an incremental improvement over existing MARL methods.
The paper tackles the challenge of designing effective reward functions for Multi-Agent Reinforcement Learning (MARL) in Formation Control with Collision Avoidance (FCCA). It introduces a framework that uses large language models (LLMs) to generate reward functions that can be adjusted dynamically, so that superior performance in dynamic environments is reached in fewer iterations.
Multi-Agent Systems (MAS) excel at accomplishing complex objectives through the collaborative efforts of individual agents. Among the methodologies employed in MAS, Multi-Agent Reinforcement Learning (MARL) stands out as one of the most effective. However, the complex objective of Formation Control with Collision Avoidance (FCCA) poses a major challenge: designing an effective reward function that facilitates swift convergence of the policy network to an optimal solution. In this paper, we introduce a novel framework that aims to overcome this challenge. By providing large language models (LLMs) with the prioritization of tasks and the observable information available to each agent, our framework generates reward functions that can be dynamically adjusted online based on evaluation outcomes, using more advanced evaluation metrics rather than the rewards themselves. This mechanism enables the MAS to simultaneously achieve formation control and obstacle avoidance in dynamic environments with enhanced efficiency, requiring fewer iterations to reach superior performance levels. Our empirical studies, conducted in both simulation and real-world settings, validate the practicality and effectiveness of our proposed approach.
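The adjustment mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the metric names (formation_error, collision_rate), the weighted-penalty reward form, the thresholds, and the function propose_weights are all assumptions made for illustration. In particular, propose_weights is a simple rule-based stand-in for the LLM call, which in the real framework would receive the task priorities, the agents' observable information, and the evaluation results.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Metrics:
    """Evaluation metrics measured from rollouts (hypothetical choices)."""
    formation_error: float  # mean distance of agents from their formation slots
    collision_rate: float   # fraction of time steps in which a collision occurred


def make_reward(weights: Dict[str, float]) -> Callable[[float, float], float]:
    """Build a per-agent reward from weighted penalty terms (assumed form)."""
    def reward(slot_distance: float, obstacle_distance: float) -> float:
        formation_term = -weights["formation"] * slot_distance
        # Penalise an agent only when it is inside the safety radius.
        intrusion = max(0.0, weights["safe_radius"] - obstacle_distance)
        collision_term = -weights["collision"] * intrusion
        return formation_term + collision_term
    return reward


def propose_weights(weights: Dict[str, float], metrics: Metrics) -> Dict[str, float]:
    """Rule-based stand-in for the LLM (hypothetical, not the paper's prompt).

    The update is driven by the evaluation metrics, not by the reward values.
    Collision avoidance is treated as the higher-priority task, so it is
    checked first; the thresholds 0.05 and 0.2 are arbitrary.
    """
    new_weights = dict(weights)
    if metrics.collision_rate > 0.05:
        new_weights["collision"] *= 1.5
    elif metrics.formation_error > 0.2:
        new_weights["formation"] *= 1.5
    return new_weights


def adjust_reward(weights: Dict[str, float], metrics: Metrics):
    """One online adjustment step: new weights and the reward built from them."""
    new_weights = propose_weights(weights, metrics)
    return new_weights, make_reward(new_weights)
```

In a training loop, adjust_reward would be called after each evaluation phase, and the returned reward function would replace the previous one for the next batch of MARL updates. How often the reward is regenerated and which metrics are fed back are design choices that this sketch leaves open.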