LG, AI, ML | Feb 10, 2023

Robust Knowledge Transfer in Tiered Reinforcement Learning

arXiv:2302.05534v3 | 1 citation | h-index: 29
Originality Highly original
AI Analysis

This work addresses robust knowledge transfer in reinforcement learning for scenarios with parallel tasks, offering provable benefits without prior similarity knowledge, though it is incremental in extending to multiple low-tier tasks.

The paper tackles knowledge transfer from low-tier to high-tier tasks in tiered reinforcement learning without assuming shared dynamics or rewards. For the high-tier task, it achieves constant regret on partial states, depending on task similarity, and near-optimal regret when the tasks are dissimilar. The low-tier task retains near-optimal performance.

In this paper, we study the Tiered Reinforcement Learning setting, a parallel transfer learning framework in which the goal is to transfer knowledge from the low-tier (source) task to the high-tier (target) task, reducing the exploration risk of the latter while solving the two tasks in parallel. Unlike previous work, we do not assume that the low-tier and high-tier tasks share the same dynamics or reward functions, and we focus on robust knowledge transfer without prior knowledge of the task similarity. We identify a natural and necessary condition for our objective, called "Optimal Value Dominance". Under this condition, we propose novel online learning algorithms that, for the high-tier task, achieve constant regret on partial states depending on the task similarity and retain near-optimal regret when the two tasks are dissimilar, while for the low-tier task they remain near-optimal without sacrifice. We further study the setting with multiple low-tier tasks and propose a novel transfer source selection mechanism, which can ensemble the information from all low-tier tasks and allows provable benefits on a much larger state-action space.
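
For readers unfamiliar with the regret terminology in the abstract, the following is the standard definition of cumulative regret in episodic reinforcement learning, offered only as an illustrative sketch. The symbols $K$ (number of episodes), $\pi_k$ (policy played in episode $k$), $s_1$ (initial state), $V_1^{*}$ (optimal value) and $V_1^{\pi_k}$ (value of $\pi_k$) are notational assumptions that do not appear in the text above and may differ from the paper's own notation.

```latex
\[
\mathrm{Regret}(K) \;=\; \sum_{k=1}^{K} \Bigl( V_1^{*}(s_1) - V_1^{\pi_k}(s_1) \Bigr)
\]
```

Under this reading, "constant regret" means the sum stays bounded by a quantity that does not grow with $K$, whereas "near-optimal regret" usually means growth that matches known lower bounds up to logarithmic factors.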

Code Implementations: 1 repo