LG · AI · Jun 22, 2024

Learning Abstract World Model for Value-preserving Planning with Options

arXiv:2406.15850v1 · 4 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of building state-action spaces at the correct abstraction level for autonomous agents, though the advance appears incremental because it builds on an existing, given set of temporally-extended actions.

The paper tackles the problem of intractable decision-making in general-purpose agents by learning abstract Markov decision processes (MDPs) from sensorimotor experiences, showing that this improves sample efficiency in planning and learning for goal-based navigation environments.

General-purpose agents require fine-grained controls and rich sensory inputs to perform a wide range of tasks. However, this complexity often leads to intractable decision-making. Traditionally, agents are provided with task-specific action and observation spaces to mitigate this challenge, but this reduces autonomy. Instead, agents must be capable of building state-action spaces at the correct abstraction level from their sensorimotor experiences. We leverage the structure of a given set of temporally-extended actions to learn abstract Markov decision processes (MDPs) that operate at a higher level of temporal and state granularity. We characterize state abstractions necessary to ensure that planning with these skills, by simulating trajectories in the abstract MDP, results in policies with bounded value loss in the original MDP. We evaluate our approach in goal-based navigation environments that require continuous abstract states to plan successfully and show that abstract model learning improves the sample efficiency of planning and learning.
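To make the idea of planning in an abstract MDP over temporally-extended actions concrete, here is a minimal tabular sketch. It is not the paper's method: the abstract states here come from a hand-written abstraction function `phi`, whereas the paper learns the abstraction from experience (and, in its experiments, needs continuous abstract states). The names `learn_abstract_model`, `plan_abstract`, `phi`, and the toy corridor data are all hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def learn_abstract_model(transitions, phi, gamma=0.9):
    """Estimate an abstract option model by counting.

    transitions: iterable of (s, option, reward, s_next, duration), where
    `reward` is the discounted reward accumulated while the option ran.
    phi: hypothetical, hand-written map from ground states to abstract states.
    """
    weights = defaultdict(lambda: defaultdict(float))
    reward_sum = defaultdict(float)
    visits = defaultdict(int)
    for s, option, reward, s_next, duration in transitions:
        key = (phi(s), option)
        # Fold the option's duration into a discounted transition weight.
        weights[key][phi(s_next)] += gamma ** duration
        reward_sum[key] += reward
        visits[key] += 1
    return {
        key: (reward_sum[key] / n, {z: w / n for z, w in weights[key].items()})
        for key, n in visits.items()
    }


def plan_abstract(model, sweeps=100):
    """Value iteration over abstract states; discounting is already in the weights."""
    states = {z for (z, _o) in model} | {
        z2 for _r, nxt in model.values() for z2 in nxt
    }
    V = {z: 0.0 for z in states}
    for _ in range(sweeps):
        for z in states:
            q = [
                r + sum(w * V[z2] for z2, w in nxt.items())
                for (zz, _o), (r, nxt) in model.items()
                if zz == z
            ]
            if q:  # abstract states with no available option keep value 0
                V[z] = max(q)
    return V


# Toy corridor (assumed data): ground cells 0..5 grouped into three "rooms".
# The option "right" takes 2 steps to reach the next room; entering the last
# room yields reward 1.
toy_transitions = [
    (0, "right", 0.0, 2, 2),
    (2, "right", 1.0, 4, 2),
]
toy_model = learn_abstract_model(toy_transitions, phi=lambda s: s // 2, gamma=0.9)
toy_values = plan_abstract(toy_model)
```

In this toy example the middle room has value 1 (the reward for executing "right"), and the first room has value of about 0.81, that is, the middle room's value discounted by two steps at gamma = 0.9. The bounded-value-loss guarantee discussed in the abstract depends on the abstraction satisfying the paper's conditions, which this sketch does not check.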

