TEA: Trajectory Encoding Augmentation for Robust and Transferable Policies in Offline Reinforcement Learning
This work addresses the challenge of policy generalization in offline RL for applications that must adapt to new dynamics; it represents an incremental improvement.
The paper tackles the problem of training a single robust policy in offline reinforcement learning that generalizes across environments with unseen dynamics. The results show that the proposed Trajectory Encoding Augmentation (TEA) method improves transferability to novel environments compared with policies that use unmodified states.
In this paper, we investigate offline reinforcement learning (RL) with the goal of training a single robust policy that generalizes effectively across environments with unseen dynamics. We propose a novel approach, Trajectory Encoding Augmentation (TEA), which extends the state space by integrating latent representations of environmental dynamics obtained from sequence encoders, such as AutoEncoders. Our findings show that incorporating these encodings with TEA improves the transferability of a single policy to novel environments with new dynamics, surpassing methods that rely solely on unmodified states. These results indicate that TEA captures critical, environment-specific characteristics, enabling RL agents to generalize effectively across dynamic conditions.
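The state-augmentation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the encoder here is a fixed random projection followed by tanh, standing in for a trained sequence encoder such as an autoencoder, and every name and dimension (`encode_trajectory`, `augment_state`, `HORIZON`, `LATENT_DIM`, and so on) is a hypothetical choice made for illustration.

```python
import math
import random
from typing import List, Sequence


def encode_trajectory(window: Sequence[Sequence[float]],
                      weights: Sequence[Sequence[float]]) -> List[float]:
    """Map a window of recent transitions to a latent vector.

    Assumption: a fixed linear projection plus tanh stands in for the
    encoder half of a trained sequence autoencoder.
    """
    flat = [x for transition in window for x in transition]
    return [math.tanh(sum(w * x for w, x in zip(row, flat))) for row in weights]


def augment_state(state: Sequence[float], latent: Sequence[float]) -> List[float]:
    """Concatenate the raw state with the dynamics encoding."""
    return list(state) + list(latent)


# Hypothetical dimensions, chosen only for this example.
STATE_DIM, ACTION_DIM, HORIZON, LATENT_DIM = 4, 2, 8, 3
TRANSITION_DIM = STATE_DIM + ACTION_DIM + STATE_DIM  # (s, a, s') per step

rng = random.Random(0)
weights = [[rng.gauss(0.0, 0.1) for _ in range(HORIZON * TRANSITION_DIM)]
           for _ in range(LATENT_DIM)]
window = [[rng.gauss(0.0, 1.0) for _ in range(TRANSITION_DIM)]
          for _ in range(HORIZON)]
state = [rng.gauss(0.0, 1.0) for _ in range(STATE_DIM)]

latent = encode_trajectory(window, weights)
augmented = augment_state(state, latent)
print(len(augmented))  # 7: STATE_DIM + LATENT_DIM
```

In this sketch, the policy would take `augmented` as input in place of `state`, so that two environments with identical observations but different dynamics can present different inputs to the policy.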