LG AIJan 23, 2025

Concurrent Learning with Aggregated States via Randomized Least Squares Value Iteration

Yan Chen, Qinxun Bai, Yiteng Zhang, Shi Dong, Maria Dimakopoulou, Qi Sun, Zhengyuan Zhou

arXiv:2501.13394v34.1h-index: 13

Originality Incremental advance

AI Analysis

This work addresses a theoretical gap in concurrent learning for multi-agent reinforcement learning, showing incremental improvements in space complexity.

The paper tackles the problem of whether randomization can help multiple agents explore concurrently in reinforcement learning, demonstrating polynomial worst-case regret bounds with per-agent regret decreasing at an optimal rate of Θ(1/√N) in finite- and infinite-horizon environments.

Designing learning agents that explore efficiently in a complex environment has been widely recognized as a fundamental challenge in reinforcement learning. While a number of works have demonstrated the effectiveness of techniques based on randomized value functions on a single agent, it remains unclear, from a theoretical point of view, whether injecting randomization can help a society of agents {\it concurently} explore an environment. The theoretical results %that we established in this work tender an affirmative answer to this question. We adapt the concurrent learning framework to \textit{randomized least-squares value iteration} (RLSVI) with \textit{aggregated state representation}. We demonstrate polynomial worst-case regret bounds in both finite- and infinite-horizon environments. In both setups the per-agent regret decreases at an optimal rate of $Θ\left(\frac{1}{\sqrt{N}}\right)$, highlighting the advantage of concurent learning. Our algorithm exhibits significantly lower space complexity compared to \cite{russo2019worst} and \cite{agrawal2021improved}. We reduce the space complexity by a factor of $K$ while incurring only a $\sqrt{K}$ increase in the worst-case regret bound, compared to \citep{agrawal2021improved,russo2019worst}. Additionally, we conduct numerical experiments to demonstrate our theoretical findings.

View on arXiv PDF

Similar