AI · FL · Jun 23, 2023

Reinforcement Learning with Temporal-Logic-Based Causal Diagrams

arXiv:2306.13732v1 · 5 citations · h-index: 53
Originality: Incremental advance
AI Analysis

This paper addresses inefficient exploration in RL for temporally extended tasks, though the contribution appears incremental because it builds on existing DFA-based methods.

The paper tackles reinforcement learning with temporally extended goals by proposing Temporal-Logic-based Causal Diagrams (TL-CDs), which encode causal knowledge about the environment. In the reported case studies, this significantly reduces exploration and speeds up convergence to optimal policies.

We study a class of reinforcement learning (RL) tasks where the objective of the agent is to accomplish temporally extended goals. In this setting, a common approach is to represent the tasks as deterministic finite automata (DFA) and integrate them into the state-space for RL algorithms. However, while these machines model the reward function, they often overlook the causal knowledge about the environment. To address this limitation, we propose the Temporal-Logic-based Causal Diagram (TL-CD) in RL, which captures the temporal causal relationships between different properties of the environment. We exploit the TL-CD to devise an RL algorithm in which an agent requires significantly less exploration of the environment. To this end, based on a TL-CD and a task DFA, we identify configurations where the agent can determine the expected rewards early during an exploration. Through a series of case studies, we demonstrate the benefits of using TL-CDs, particularly the faster convergence of the algorithm to an optimal policy due to reduced exploration of the environment.
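The idea of determining the expected reward early can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a heavily simplified form of causal knowledge, in which observing a trigger label guarantees that certain other labels never occur afterwards. The names `TaskDFA`, `can_accept`, `run_episode`, and `causal_rules`, as well as the labels `key`, `goal`, and `trap`, are hypothetical and chosen only for illustration.

```python
from collections import deque


class TaskDFA:
    """A task DFA whose transitions are keyed by (state, label)."""

    def __init__(self, transitions, initial, accepting):
        self.transitions = transitions  # {(state, label): next_state}
        self.initial = initial
        self.accepting = set(accepting)

    def step(self, state, label):
        # Labels without an explicit transition leave the state unchanged.
        return self.transitions.get((state, label), state)

    def can_accept(self, state, blocked_labels):
        # Breadth-first search over DFA states, skipping transitions whose
        # label the (assumed) causal knowledge rules out from now on.
        seen = {state}
        queue = deque([state])
        while queue:
            s = queue.popleft()
            if s in self.accepting:
                return True
            for (src, label), dst in self.transitions.items():
                if src == s and label not in blocked_labels and dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
        return False


def run_episode(dfa, labels, causal_rules):
    """Return (reward, steps_used) for a sequence of observed labels.

    causal_rules is a simplified stand-in for a TL-CD (an assumption of this
    sketch): it maps a trigger label to the labels that can never occur
    after the trigger has been observed.
    """
    state = dfa.initial
    blocked = set()
    for t, label in enumerate(labels, start=1):
        state = dfa.step(state, label)
        blocked |= causal_rules.get(label, set())
        if state in dfa.accepting:
            return 1, t  # task accomplished
        if not dfa.can_accept(state, blocked):
            return 0, t  # reward already known to be 0: stop exploring
    return 0, len(labels)


# Hypothetical task: observe "key", then "goal".
dfa = TaskDFA(
    transitions={("q0", "key"): "q1", ("q1", "goal"): "q2"},
    initial="q0",
    accepting={"q2"},
)
# Hypothetical causal knowledge: after "trap", "key" never occurs again.
rules = {"trap": {"key"}}

with_causal = run_episode(dfa, ["trap", "empty", "empty"], rules)
without_causal = run_episode(dfa, ["trap", "empty", "empty"], {})
success = run_episode(dfa, ["key", "goal"], rules)
```

In this toy setting, the episode that hits `trap` is cut off after one step when the causal rule is available, whereas without it the agent keeps exploring until the episode ends. The saving comes from never visiting configurations whose outcome is already determined.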

