LG ML | Jun 29, 2020

Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs

Jianzhun Du, Joseph Futoma, Finale Doshi-Velez

arXiv:2006.16210v217.461 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses a domain-specific challenge in reinforcement learning for continuous-time control and offers incremental improvements over existing methods.

The paper tackles the problem of modeling continuous-time dynamics in semi-Markov decision processes (SMDPs) using neural ODEs. The resulting models yield high-performing policies from a small amount of data, and a model-based method for optimizing time schedules reduces the interaction rate with the environment while maintaining near-optimal performance.

We present two elegant solutions for modeling continuous-time dynamics in a novel model-based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs), using neural ordinary differential equations (ODEs). Our models accurately characterize continuous-time dynamics and enable us to develop high-performing policies using a small amount of data. We also develop a model-based approach for optimizing time schedules to reduce interaction rates with the environment while maintaining near-optimal performance, which is not possible for model-free methods. We experimentally demonstrate the efficacy of our methods across various continuous-time domains.
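To make the idea of an ODE-based SMDP dynamics model concrete, here is a minimal sketch in plain NumPy. It is not the authors' architecture or code. It assumes a state z that evolves as dz/dt = f(z, a) while an action a is held for a variable time interval tau, and it integrates that ODE with a fixed-step Runge-Kutta solver. The names `make_dynamics`, `rk4_step`, `predict_next_state`, and `discounted_return` are hypothetical, the network is untrained, and the time-based discounting convention is an assumption rather than something stated in the abstract.

```python
import numpy as np


def make_dynamics(state_dim, action_dim, hidden=16, seed=0):
    """Return f(z, a) -> dz/dt: a tiny, untrained MLP standing in for a
    learned neural ODE right-hand side (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.3, size=(hidden, state_dim + action_dim))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.3, size=(state_dim, hidden))
    b2 = np.zeros(state_dim)

    def f(z, a):
        h = np.tanh(W1 @ np.concatenate([z, a]) + b1)
        return W2 @ h + b2

    return f


def rk4_step(f, z, a, dt):
    """One classical fourth-order Runge-Kutta step with the action held fixed."""
    k1 = f(z, a)
    k2 = f(z + 0.5 * dt * k1, a)
    k3 = f(z + 0.5 * dt * k2, a)
    k4 = f(z + dt * k3, a)
    return z + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def predict_next_state(f, z, a, tau, max_dt=0.05):
    """Integrate dz/dt = f(z, a) over a variable interval tau.

    Unlike a discrete-time model, the prediction horizon is an input,
    so irregularly spaced transitions are handled by the same model."""
    n_steps = max(1, int(np.ceil(tau / max_dt)))
    dt = tau / n_steps
    for _ in range(n_steps):
        z = rk4_step(f, z, a, dt)
    return z


def discounted_return(rewards, taus, gamma=0.99):
    """Return with reward k discounted by gamma ** (time elapsed before it).

    This discounting convention is an assumption made for illustration."""
    elapsed = np.concatenate([[0.0], np.cumsum(taus)[:-1]])
    return float(np.sum(gamma ** elapsed * np.asarray(rewards, dtype=float)))
```

With a model of this shape, a planner could compare candidate intervals tau for the same action by calling `predict_next_state` with different values of tau, which is one way to picture how a model-based method can reason about time schedules.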
