LG AI | Sep 28, 2023

Multi-Bellman operator for convergence of $Q$-learning with linear function approximation

Diogo S. Carvalho, Pedro A. Santos, Francisco S. Melo

arXiv:2309.16819v15.34 citations | h-index: 27

Originality: Incremental advance

AI Analysis

The paper addresses a fundamental problem in reinforcement learning, the convergence of $Q$-learning with function approximation, and offers an incremental improvement in convergence guarantees that is relevant to both researchers and practitioners.

The paper tackles the convergence issue of Q-learning with linear function approximation by introducing a novel multi-Bellman operator, which provides improved fixed-point guarantees and leads to a new algorithm that converges to solutions of arbitrary accuracy.

We study the convergence of $Q$-learning with linear function approximation. Our key contribution is the introduction of a novel multi-Bellman operator that extends the traditional Bellman operator. By exploring the properties of this operator, we identify conditions under which the projected multi-Bellman operator becomes contractive, providing improved fixed-point guarantees compared to the Bellman operator. To leverage these insights, we propose the multi $Q$-learning algorithm with linear function approximation. We demonstrate that this algorithm converges to the fixed-point of the projected multi-Bellman operator, yielding solutions of arbitrary accuracy. Finally, we validate our approach by applying it to well-known environments, showcasing the effectiveness and applicability of our findings.
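The abstract does not define the multi-Bellman operator, so the following is only a minimal sketch under a stated assumption: that the operator is the $m$-fold composition of the standard Bellman optimality operator. The random tabular MDP and the helper names (`make_random_mdp`, `bellman_optimality`, `multi_bellman`, `contraction_ratio`) are hypothetical and chosen for illustration. The sketch checks numerically that, without any projection, composing the operator $m$ times shrinks sup-norm distances by at least a factor of $\gamma^m$. It does not reproduce the paper's result on the projected operator or the multi $Q$-learning algorithm.

```python
import numpy as np


def make_random_mdp(n_states, n_actions, seed=0):
    """Hypothetical helper: random transition kernel P[s, a, s'] and rewards r[s, a]."""
    rng = np.random.default_rng(seed)
    P = rng.random((n_states, n_actions, n_states))
    P /= P.sum(axis=2, keepdims=True)  # each P[s, a, :] is a probability distribution
    r = rng.random((n_states, n_actions))
    return P, r


def bellman_optimality(q, P, r, gamma):
    """One application of the standard Bellman optimality operator on a tabular q."""
    v = q.max(axis=1)         # greedy state values, shape (S,)
    return r + gamma * P @ v  # (S, A, S) @ (S,) -> (S, A)


def multi_bellman(q, P, r, gamma, m):
    """ASSUMPTION: the multi-Bellman operator is the m-fold composition of the operator above."""
    for _ in range(m):
        q = bellman_optimality(q, P, r, gamma)
    return q


def contraction_ratio(P, r, gamma, m, seed=1):
    """Sup-norm distance after applying the operator, divided by the distance before."""
    rng = np.random.default_rng(seed)
    q1 = rng.normal(size=r.shape)
    q2 = rng.normal(size=r.shape)
    after = np.abs(multi_bellman(q1, P, r, gamma, m) - multi_bellman(q2, P, r, gamma, m)).max()
    before = np.abs(q1 - q2).max()
    return after / before


if __name__ == "__main__":
    P, r = make_random_mdp(n_states=5, n_actions=3)
    gamma = 0.9
    for m in (1, 2, 4, 8):
        ratio = contraction_ratio(P, r, gamma, m)
        print(f"m={m}: ratio={ratio:.4f}, gamma**m={gamma ** m:.4f}")
```

Under this assumption, a larger $m$ gives a smaller contraction modulus for the unprojected operator. This is one plausible intuition for why a projected version could become contractive under suitable conditions; the precise conditions are the subject of the paper and are not stated in the abstract.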
