LG | SY | Jan 21, 2024

Solving Offline Reinforcement Learning with Decision Tree Regression

arXiv:2401.11630v2 | 3 citations | CoRL
Originality: Incremental advance
AI Analysis

The paper addresses offline RL problems in robotics and locomotion domains with a fast, explainable approach, though it appears incremental because it adapts existing methods to a new context.

The paper tackles offline reinforcement learning by reframing it as a regression task solvable with Decision Trees. It introduces two frameworks (RCDTP and RWDTP) that typically train in less than a few minutes and perform at least on par with established methods on D4RL and other robotic tasks.

This study presents a novel approach to offline reinforcement learning (RL) problems: reframing them as regression tasks that can be solved effectively with Decision Trees. Specifically, we introduce two distinct frameworks, return-conditioned and return-weighted decision tree policies (RCDTP and RWDTP), both of which are fast in agent training as well as inference, with training typically lasting less than a few minutes. Despite the simplification inherent in this reformulation of offline RL, our agents demonstrate performance that is at least on par with established methods. We evaluate our methods on D4RL datasets for locomotion and manipulation, as well as on other robotic tasks involving wheeled and flying robots. Additionally, we assess performance in delayed/sparse reward scenarios and highlight the explainability of these policies through action distributions and feature importance.
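To make the regression framing concrete, the following is a minimal sketch of how an offline dataset might be turned into supervised training data for the two policy types. It is not the authors' implementation. The trajectory layout, the function names, the choice of input features for the return-conditioned case (state, return-to-go, timestep), and the min-max normalisation of weights in the return-weighted case are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: a trajectory is a list of (state, action, reward) steps.
Step = Tuple[Sequence[float], Sequence[float], float]
Trajectory = List[Step]


def returns_to_go(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Discounted sum of rewards from each timestep to the end of the episode."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


def return_conditioned_dataset(trajectories: List[Trajectory]):
    """Assumed return-conditioned framing: regress action on
    (state, return-to-go, timestep)."""
    X, y = [], []
    for traj in trajectories:
        rtg = returns_to_go([reward for _, _, reward in traj])
        for t, (state, action, _) in enumerate(traj):
            X.append(list(state) + [rtg[t], float(t)])
            y.append(list(action))
    return X, y


def return_weighted_dataset(trajectories: List[Trajectory]):
    """Assumed return-weighted framing: regress action on state alone,
    weighting each sample by its min-max normalised return-to-go."""
    X, y, raw = [], [], []
    for traj in trajectories:
        rtg = returns_to_go([reward for _, _, reward in traj])
        for t, (state, action, _) in enumerate(traj):
            X.append(list(state))
            y.append(list(action))
            raw.append(rtg[t])
    lo, hi = min(raw), max(raw)
    span = hi - lo
    # If every sample has the same return, fall back to uniform weights.
    weights = [(r - lo) / span if span > 0 else 1.0 for r in raw]
    return X, y, weights
```

Either dataset can then be passed to any tree regressor. With scikit-learn, for example, `DecisionTreeRegressor().fit(X, y)` would cover the return-conditioned case and `DecisionTreeRegressor().fit(X, y, sample_weight=weights)` the return-weighted one. The tree library used in the paper is not stated in this listing, so that choice is also an assumption.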

Code Implementations: 1 repo
