RO, AI, LG, SY | Apr 1, 2019

Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models

arXiv:1904.01068v1 | 15 citations
Originality: Incremental advance
AI Analysis

This paper addresses safety-critical exploration for reinforcement learning agents in domains such as robotics. The contribution appears incremental, since it builds on existing safe exploration techniques.

The paper tackles safe exploration in deterministic Markov Decision Processes with unknown transition models. It proposes an algorithm that uses Lipschitz-continuity to give a deterministic safety guarantee, aims to reduce the number of actions needed to explore the safe space, and is compared against baseline methods on simulated navigation tasks.

We propose a safe exploration algorithm for deterministic Markov Decision Processes with unknown transition models. Our algorithm guarantees safety by leveraging Lipschitz-continuity to ensure that no unsafe states are visited during exploration. Unlike many other existing techniques, the provided safety guarantee is deterministic. Our algorithm is optimized to reduce the number of actions needed for exploring the safe space. We demonstrate the performance of our algorithm in comparison with baseline methods in simulation on navigation tasks.

