LG, AI | May 31, 2023

Representation-Driven Reinforcement Learning

Ofir Nabati, Guy Tennenholtz, Shie Mannor

arXiv:2305.19922v3 | 9.84 citations

Originality: Incremental advance

AI Analysis

This work addresses a fundamental challenge in reinforcement learning for AI systems, though it appears incremental as it builds on existing techniques like contextual bandits.

The paper tackles the exploration-exploitation problem in reinforcement learning by reframing it as a representation-exploitation problem. The authors report that applying this framework to evolutionary and policy gradient-based approaches significantly improves performance compared to traditional methods.

We present a representation-driven framework for reinforcement learning. By representing policies as estimates of their expected values, we leverage techniques from contextual bandits to guide exploration and exploitation. Particularly, embedding a policy network into a linear feature space allows us to reframe the exploration-exploitation problem as a representation-exploitation problem, where good policy representations enable optimal exploration. We demonstrate the effectiveness of this framework through its application to evolutionary and policy gradient-based approaches, leading to significantly improved performance compared to traditional methods. Our framework provides a new perspective on reinforcement learning, highlighting the importance of policy representation in determining optimal exploration-exploitation strategies.
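The idea of treating policy embeddings as arms of a linear contextual bandit can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic LinUCB-style selection rule, assuming each candidate policy already has a fixed embedding vector and that its expected return is roughly linear in that embedding. The names `linucb_select`, `linucb_update`, and `run_demo`, the synthetic weight vector `true_w`, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def linucb_select(candidates, A, b, alpha=1.0):
    """Pick the candidate embedding with the highest upper confidence bound.

    candidates: (n, d) array, one hypothetical policy embedding per row.
    A, b: ridge-regression statistics (d x d matrix and length-d vector).
    alpha: exploration weight (assumed value, not taken from the paper).
    """
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b                      # current linear value estimate
    means = candidates @ theta             # predicted return per candidate
    # Confidence width sqrt(z^T A^{-1} z) for every candidate z.
    widths = np.sqrt(np.einsum("ij,jk,ik->i", candidates, A_inv, candidates))
    return int(np.argmax(means + alpha * widths))


def linucb_update(A, b, z, ret):
    """Fold one observed (embedding, return) pair into the statistics."""
    return A + np.outer(z, z), b + ret * z


def run_demo(seed=0, dim=4, n_candidates=20, n_rounds=200, noise=0.1):
    """Synthetic loop: returns are linear in the embedding plus noise."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=dim)          # unknown to the learner
    candidates = rng.normal(size=(n_candidates, dim))
    A, b = np.eye(dim), np.zeros(dim)
    for _ in range(n_rounds):
        i = linucb_select(candidates, A, b)
        ret = candidates[i] @ true_w + noise * rng.normal()
        A, b = linucb_update(A, b, candidates[i], ret)
    return np.linalg.solve(A, b), true_w


if __name__ == "__main__":
    theta_hat, true_w = run_demo()
    print("estimated weights:", theta_hat)
    print("true weights:     ", true_w)
```

In this sketch the quality of exploration depends entirely on the embedding: if returns are not close to linear in the chosen features, both the value estimate and the confidence widths are misleading, which is one way to read the abstract's claim that good policy representations enable good exploration.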
