Optimistic Agents are Asymptotically Optimal
The paper provides a theoretical foundation for asymptotically optimal reinforcement learning agents, though the contribution appears incremental because it builds on the existing principle of optimism.
The paper addresses the problem of achieving asymptotically optimal behavior in reinforcement learning for an arbitrary finite or compact class of environments. It shows that optimistic agents achieve this, and it gives finite error bounds in the finite deterministic case.
We use optimism to introduce generic asymptotically optimal reinforcement learning agents. For an arbitrary finite or compact class of environments, these agents achieve asymptotically optimal behavior. Furthermore, in the finite deterministic case we provide finite error bounds.
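To make the idea of optimism concrete, the following is a minimal sketch of an optimistic agent for a finite class of deterministic environments. It is an illustration under stated assumptions, not the paper's own algorithm: environments are modeled as small lookup tables from (state, action) to (next state, reward), planning is an exhaustive finite-horizon search, and the names `plan`, `optimistic_agent`, `true_env`, and `optimistic_guess` are hypothetical. At each step the agent acts as if the most favorable environment still consistent with its observations were the true one, then discards every environment whose prediction was contradicted.

```python
from typing import Dict, List, Tuple

# Assumption for illustration: a deterministic environment is a table
# mapping (state, action) -> (next_state, reward).
Env = Dict[Tuple[int, int], Tuple[int, float]]


def plan(env: Env, state: int, actions: List[int],
         gamma: float, horizon: int) -> Tuple[float, int]:
    """Return (discounted value, best first action) by exhaustive search."""
    if horizon == 0:
        return 0.0, actions[0]
    best_value, best_action = float("-inf"), actions[0]
    for a in actions:
        next_state, reward = env[(state, a)]
        future, _ = plan(env, next_state, actions, gamma, horizon - 1)
        value = reward + gamma * future
        if value > best_value:
            best_value, best_action = value, a
    return best_value, best_action


def optimistic_agent(true_env: Env, candidates: List[Env], actions: List[int],
                     gamma: float = 0.9, horizon: int = 6,
                     steps: int = 20, start: int = 0):
    """Run the agent in true_env; return collected rewards and surviving candidates."""
    consistent = list(candidates)
    state, rewards = start, []
    for _ in range(steps):
        # Optimism: follow the plan of the consistent environment
        # that promises the highest value from the current state.
        _, action = max(
            (plan(env, state, actions, gamma, horizon) for env in consistent),
            key=lambda value_action: value_action[0],
        )
        next_state, reward = true_env[(state, action)]
        # Keep only environments that predicted what was actually observed.
        consistent = [env for env in consistent
                      if env[(state, action)] == (next_state, reward)]
        state = next_state
        rewards.append(reward)
    return rewards, consistent


# Hypothetical one-state example: the over-optimistic candidate wrongly
# promises reward 1.0 for action 1, which actually yields 0.0.
true_env = {(0, 0): (0, 0.5), (0, 1): (0, 0.0)}
optimistic_guess = {(0, 0): (0, 0.5), (0, 1): (0, 1.0)}
rewards, remaining = optimistic_agent(
    true_env, [optimistic_guess, true_env], actions=[0, 1])
```

In this toy run the agent tries the over-promised action once, observes the contradiction, eliminates the wrong candidate, and then behaves optimally. Because the true environment is never contradicted, at most `len(candidates) - 1` eliminations can occur in this sketch, which gives some intuition for why finite error bounds are plausible in the finite deterministic case.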