ROMA-iQSS: An Objective Alignment Approach via State-Based Value Learning and ROund-Robin Multi-Agent Scheduling
This work addresses scalability and efficiency issues in large multi-agent systems, though it appears incremental as it builds on existing decentralized learning approaches.
The paper tackles the problem of multi-agent collaboration by addressing challenges in identifying optimal objectives and aligning them among agents, introducing a decentralized state-based value learning algorithm and a novel interaction mechanism, with empirical results showing it outperforms existing decentralized strategies.
Effective multi-agent collaboration is imperative for solving complex, distributed problems. In this context, two key challenges must be addressed: first, autonomously identifying optimal objectives for collective outcomes; second, aligning these objectives among agents. Traditional frameworks, often reliant on centralized learning, struggle with scalability and efficiency in large multi-agent systems. To overcome these issues, we introduce a decentralized state-based value learning algorithm that enables agents to independently discover optimal states. Furthermore, we introduce a novel mechanism for multi-agent interaction, wherein less proficient agents follow and adopt policies from more experienced ones, thereby indirectly guiding their learning process. Our theoretical analysis shows that our approach leads decentralized agents to an optimal collective policy. Empirical experiments further demonstrate that our method outperforms existing decentralized state-based and action-based value learning strategies by effectively identifying and aligning optimal objectives.