LG · AI · Sep 20, 2022

Towards Task-Prioritized Policy Composition

arXiv:2209.09536v1 · h-index: 27
Originality: Highly original
AI Analysis

This work addresses inefficient, non-modular policy composition in RL. Aimed at researchers and practitioners, it proposes a new paradigm rather than an incremental improvement.

The paper addresses the lack of prioritized policy composition methods in Reinforcement Learning (RL) by proposing a framework built on the indifference-space of RL policies, which enables modular design and knowledge transfer. According to the authors, the method has the potential to increase data efficiency and can ensure high-priority constraint satisfaction, which makes it promising for safety-critical domains like robotics.

Combining learned policies in a prioritized, ordered manner is desirable because it allows for modular design and facilitates data reuse through knowledge transfer. In control theory, prioritized composition is realized by null-space control, where low-priority control actions are projected into the null-space of high-priority control actions. Such a method is currently unavailable for Reinforcement Learning. We propose a task-prioritized composition framework for Reinforcement Learning, which involves a novel concept: the indifference-space of Reinforcement Learning policies. Our framework has the potential to facilitate knowledge transfer and modular design while greatly increasing data efficiency and data reuse for Reinforcement Learning agents. Further, our approach can ensure high-priority constraint satisfaction, which makes it promising for learning in safety-critical domains like robotics. Unlike null-space control, our approach allows learning globally optimal policies for the compound task by online learning in the indifference-space of higher-level policies after initial compound policy construction.
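
To make the null-space control idea from the abstract concrete, the following is a minimal sketch of the classical projection for a single scalar high-priority task. It is not the paper's RL method. It only illustrates the control-theory baseline the abstract refers to. All names (`null_space_projector`, `compose`, `jacobian_row`) and the numeric values are assumptions chosen for illustration, and the high-priority task is assumed to be described by one non-zero Jacobian row J, so that the projector is N = I - JᵀJ / (J Jᵀ).

```python
# Illustrative sketch only: null-space projection for ONE scalar
# high-priority task. Names and values are hypothetical, not from the paper.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def null_space_projector(jacobian_row):
    """Return N = I - J^T J / (J J^T) for a single-row Jacobian J."""
    norm_sq = dot(jacobian_row, jacobian_row)
    if norm_sq == 0.0:
        raise ValueError("Jacobian row must be non-zero")
    n = len(jacobian_row)
    return [
        [(1.0 if i == j else 0.0) - jacobian_row[i] * jacobian_row[j] / norm_sq
         for j in range(n)]
        for i in range(n)
    ]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, vector) for row in matrix]

def compose(high_action, low_action, jacobian_row):
    """Add the low-priority action after projecting it into the
    null-space of the high-priority task."""
    projected_low = mat_vec(null_space_projector(jacobian_row), low_action)
    return [h + l for h, l in zip(high_action, projected_low)]

# Assumed example: 3-dimensional action space, one high-priority task.
jacobian_row = [1.0, 2.0, 0.0]   # maps actions to the high-priority task value
high_action = [0.2, 0.4, 0.0]    # action chosen by the high-priority controller
low_action = [1.0, -1.0, 0.5]    # action requested by the low-priority controller

combined = compose(high_action, low_action, jacobian_row)

# The high-priority task value is unchanged by the low-priority contribution.
task_before = dot(jacobian_row, high_action)
task_after = dot(jacobian_row, combined)
```

In this sketch the low-priority action only acts along directions that leave the high-priority task value untouched, which is the strict ordering the abstract describes. A task with several rows would need a matrix pseudoinverse instead of the scalar division used here.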
