LG | AI | May 31, 2022

Provable General Function Class Representation Learning in Multitask Bandits and MDPs

Tsinghua
arXiv:2205.15701v3 | 12 citations | h-index: 49
Originality: Highly original
AI Analysis

This work addresses a foundational gap in RL theory by enabling provable guarantees for non-linear representations like neural networks, which is crucial for improving sample efficiency in real-world applications.

The authors tackled the limited theoretical understanding of multitask representation learning in reinforcement learning by extending analysis to general function class representations, achieving provable benefits for bandits and linear MDPs for the first time.

While multitask representation learning has become a popular approach in reinforcement learning (RL) to boost sample efficiency, the theoretical understanding of why and how it works is still limited. Most previous analytical works assume that the representation function is either already known to the agent or drawn from a linear function class, since analyzing general function class representations encounters non-trivial technical obstacles such as generalization guarantees and the formulation of confidence bounds in an abstract function space. However, linear-case analysis relies heavily on the particular structure of the linear function class, while real-world practice usually adopts general non-linear representation functions like neural networks, which significantly reduces the applicability of such analysis. In this work, we extend the analysis to general function class representations. Specifically, we consider an agent playing $M$ contextual bandits (or MDPs) concurrently and extracting a shared representation function $\phi$ from a specific function class $\Phi$ using our proposed Generalized Functional Upper Confidence Bound algorithm (GFUCB). We theoretically validate the benefit of multitask representation learning within a general function class for bandits and linear MDPs for the first time. Lastly, we conduct experiments to demonstrate the effectiveness of our algorithm with neural net representations.
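The multitask setting in the abstract, $M$ tasks that share one representation $\phi$ but differ in what they do with it, can be illustrated with a small sketch. This is not GFUCB and is not taken from the paper: it assumes each task's expected reward is a task-specific linear head applied to shared features, and it assumes $\phi$ is already known, which is the simplification of earlier work that the paper moves beyond. The names `phi`, `fit_heads`, and `simulate`, the one-layer tanh form of the representation, and all parameter values are hypothetical.

```python
import numpy as np

def phi(x, W):
    """Hypothetical shared nonlinear representation: a single tanh layer."""
    return np.tanh(x @ W)

def fit_heads(features, rewards, lam=1e-3):
    """Ridge-regress one linear head per task on the shared features.

    features: (M, n, k) array of shared features per task
    rewards:  (M, n) array of observed rewards per task
    returns:  (M, k) array of estimated task-specific heads
    """
    M, n, k = features.shape
    heads = np.empty((M, k))
    for i in range(M):
        gram = features[i].T @ features[i] + lam * np.eye(k)
        heads[i] = np.linalg.solve(gram, features[i].T @ rewards[i])
    return heads

def simulate(M=5, n=200, d=8, k=3, noise=0.01, seed=0):
    """Generate M tasks that share phi and recover each task's head."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(d, k))           # shared representation parameters (assumed known)
    true_heads = rng.normal(size=(M, k))  # one linear head per task
    X = rng.normal(size=(M, n, d))        # contexts observed in each task
    feats = phi(X, W)                     # shared features, shape (M, n, k)
    R = np.einsum("mnk,mk->mn", feats, true_heads)
    R = R + noise * rng.normal(size=(M, n))  # noisy rewards
    return true_heads, fit_heads(feats, R)

true_heads, est_heads = simulate()
max_err = np.abs(true_heads - est_heads).max()
```

With the representation fixed, each task reduces to a low-dimensional regression over $k$ features instead of $d$ raw context dimensions. Learning $\phi$ itself from a general function class $\Phi$, with confidence bounds to drive exploration, is the harder problem the abstract describes.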

