LG | AI | Jun 1, 2025

State-Covering Trajectory Stitching for Diffusion Planners

arXiv:2506.00895v3 | 11 citations | h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses limited generalization in diffusion planners for reinforcement learning, particularly in offline goal-conditioned tasks. The contribution appears incremental, since it builds on existing methods.

The paper argues that diffusion-based generative models are limited in long-horizon planning by the quality and diversity of their training data. It proposes State-Covering Trajectory Stitching (SCoTS), a trajectory augmentation method, and reports significant gains in performance and generalization on offline goal-conditioned benchmarks.

Diffusion-based generative models are emerging as powerful tools for long-horizon planning in reinforcement learning (RL), particularly with offline datasets. However, their performance is fundamentally limited by the quality and diversity of training data. This often restricts their generalization to tasks outside their training distribution or longer planning horizons. To overcome this challenge, we propose State-Covering Trajectory Stitching (SCoTS), a novel reward-free trajectory augmentation method that incrementally stitches together short trajectory segments, systematically generating diverse and extended trajectories. SCoTS first learns a temporal distance-preserving latent representation that captures the underlying temporal structure of the environment, then iteratively stitches trajectory segments guided by directional exploration and novelty to effectively cover and expand this latent space. We demonstrate that SCoTS significantly improves the performance and generalization capabilities of diffusion planners on offline goal-conditioned benchmarks requiring stitching and long-horizon reasoning. Furthermore, augmented trajectories generated by SCoTS significantly improve the performance of widely used offline goal-conditioned RL algorithms across diverse environments.
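The abstract describes the stitching loop only at a high level, so the following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes a latent encoder is already available (here the identity map stands in for the learned temporal distance-preserving representation), and it scores candidate segments with a simple sum of directional progress and novelty. The names `stitch_trajectory`, `encode`, `radius`, and `beta`, and the scoring rule itself, are hypothetical choices made for illustration.

```python
import numpy as np


def stitch_trajectory(segments, encode, rng, n_steps=5, radius=0.5, beta=1.0):
    """Greedily chain short segments into one longer trajectory.

    segments: list of (T, d) arrays of states.
    encode:   maps a state to a latent vector. This is a stand-in for the
              learned temporal distance-preserving representation.
    radius, beta: hypothetical knobs (stitching tolerance, novelty weight).
    """
    starts = np.stack([encode(s[0]) for s in segments])
    ends = np.stack([encode(s[-1]) for s in segments])

    first = int(rng.integers(len(segments)))
    chosen = [first]
    visited = [ends[first]]  # latent points covered so far

    # Assumed form of "directional exploration": one random unit direction.
    direction = rng.normal(size=ends.shape[1])
    direction /= np.linalg.norm(direction)

    for _ in range(n_steps):
        current = ends[chosen[-1]]
        # Candidates: unused segments that start near the current end in latent space.
        gaps = np.linalg.norm(starts - current, axis=1)
        candidates = [int(i) for i in np.flatnonzero(gaps <= radius) if int(i) not in chosen]
        if not candidates:
            break
        visited_arr = np.stack(visited)

        def score(i):
            # Progress along the exploration direction (assumed scoring term).
            progress = float((ends[i] - current) @ direction)
            # Novelty: distance from the candidate's end to the nearest visited point.
            novelty = float(np.min(np.linalg.norm(visited_arr - ends[i], axis=1)))
            return progress + beta * novelty

        best = max(candidates, key=score)
        chosen.append(best)
        visited.append(ends[best])

    return np.concatenate([segments[i] for i in chosen]), chosen


# Toy data: six unit-length segments laid end to end along the x-axis.
toy_segments = [np.linspace([k, 0.0], [k + 1, 0.0], 5) for k in range(6)]
traj, order = stitch_trajectory(toy_segments, encode=lambda s: s, rng=np.random.default_rng(0))
```

In this toy setup each segment's end coincides with the next segment's start, so the loop can only extend the chain forward. With real data the `radius` threshold and the relative weight `beta` would decide how aggressively the augmented trajectories spread over the latent space.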

