MA LG OC

Feb 7, 2025

Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency

Qixin Zhang, Zongqi Wan, Yu Yang, Li Shen, Dacheng Tao

arXiv:2502.05028v1, 5.95 citations, h-index: 34, ICLR

Originality: Incremental advance

AI Analysis

This work addresses coordination challenges in multi-agent systems for applications like robot planning, offering improved approximation guarantees and reduced communication requirements, though it is incremental over existing methods.

The paper tackles the problem of coordinating multiple agents to maximize submodular functions in unpredictable environments. It presents two algorithms (MA-OSMA and MA-OSEA) that achieve a regret bound of Õ(√(C_T T/(1-β))) against a (1-e^{-c})/c-approximation to the best comparator in hindsight, improving on the 1/(1+c)-approximation of the prior OSG algorithm.
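To make the gap between the two approximation ratios concrete, the following minimal sketch tabulates (1-e^{-c})/c and 1/(1+c) for a few curvature values c in [0, 1]. The function names `ratio_new` and `ratio_osg` are illustrative labels, not identifiers from the paper, and the script only evaluates the two closed-form ratios quoted above.

```python
import math


def ratio_new(c: float) -> float:
    """Return (1 - e^{-c}) / c, using the limit value 1 at c = 0."""
    if c == 0:
        return 1.0
    # expm1(-c) = e^{-c} - 1, which is numerically stable for small c.
    return -math.expm1(-c) / c


def ratio_osg(c: float) -> float:
    """Return 1 / (1 + c), the ratio quoted for the OSG algorithm."""
    return 1.0 / (1.0 + c)


if __name__ == "__main__":
    for c in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"c={c:.2f}  (1-e^-c)/c={ratio_new(c):.4f}  1/(1+c)={ratio_osg(c):.4f}")
```

Because e^c ≥ 1 + c for all real c, the first ratio is never smaller than the second. At c = 1 the two ratios are 1 - 1/e (about 0.632) and 1/2.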

Coordinating multiple agents to collaboratively maximize submodular functions in unpredictable environments is a critical task with numerous applications in machine learning, robot planning and control. The existing approaches, such as the OSG algorithm, are often hindered by their poor approximation guarantees and the rigid requirement for a fully connected communication graph. To address these challenges, we first present the $\textbf{MA-OSMA}$ algorithm, which employs the multi-linear extension to transform the discrete submodular maximization problem into a continuous optimization problem, thereby allowing us to reduce the strict dependence on a complete graph through consensus techniques. Moreover, $\textbf{MA-OSMA}$ leverages a novel surrogate gradient to avoid sub-optimal stationary points. To eliminate the computationally intensive projection operations in $\textbf{MA-OSMA}$, we also introduce a projection-free $\textbf{MA-OSEA}$ algorithm, which effectively utilizes the KL divergence by mixing in a uniform distribution. Theoretically, we confirm that both algorithms achieve a regret bound of $\widetilde{O}(\sqrt{\frac{C_{T}T}{1-\beta}})$ against a $(\frac{1-e^{-c}}{c})$-approximation to the best comparator in hindsight, where $C_{T}$ is the deviation of the maximizer sequence, $\beta$ is the spectral gap of the network and $c$ is the joint curvature of the submodular objectives. This result significantly improves on the $(\frac{1}{1+c})$-approximation provided by the state-of-the-art OSG algorithm. Finally, we demonstrate the effectiveness of our proposed algorithms through simulation-based multi-target tracking.
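As a rough illustration of the multi-linear extension mentioned in the abstract, the sketch below computes F(x) = E[f(S)], where each element i is included in S independently with probability x_i. It uses exact enumeration over a tiny ground set and a toy coverage function; the names `coverage`, `multilinear_extension`, and `SETS` are assumptions made for this example, and this is not the paper's MA-OSMA procedure, which additionally relies on consensus steps and a surrogate gradient.

```python
from itertools import product


def coverage(selected, sets):
    """Toy submodular function: number of items covered by the selected sets."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)


def multilinear_extension(f, x):
    """Exact F(x) = sum over subsets S of P(S) * f(S).

    Each element i joins S independently with probability x[i].
    Enumeration costs 2^n evaluations, so this is only for small n.
    """
    total = 0.0
    for mask in product((0, 1), repeat=len(x)):
        prob = 1.0
        for xi, bit in zip(x, mask):
            prob *= xi if bit else (1.0 - xi)
        selected = [i for i, bit in enumerate(mask) if bit]
        total += prob * f(selected)
    return total


# Hypothetical instance: two candidate sets that overlap in item 2.
SETS = [{1, 2}, {2, 3}]


def f(selected):
    return coverage(selected, SETS)


value = multilinear_extension(f, [0.5, 0.5])
print(value)
```

At integral points the extension agrees with the original set function, for example `multilinear_extension(f, [1.0, 0.0])` equals `f([0])`. For larger ground sets, a sampling-based estimate would normally replace the exact enumeration.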
