CausalCOMRL: Context-Based Offline Meta-Reinforcement Learning with Causal Representation
The paper addresses a specific issue that limits agent generalizability in offline meta-RL, but the contribution appears incremental because it builds on existing context-based methods and adds causal enhancements.
The paper tackles spurious correlations that degrade policy performance in context-based offline meta-reinforcement learning. It proposes CausalCOMRL, which integrates causal representation learning to uncover causal relationships among task components and improve generalizability, and it reports better performance than other methods on most benchmarks.
Context-based offline meta-reinforcement learning (OMRL) methods have achieved notable success by leveraging pre-collected offline datasets to learn task representations that guide policy learning. However, current context-based OMRL methods often introduce spurious correlations, in which task components are incorrectly correlated because of confounders. These correlations can degrade policy performance when the confounders in the test task differ from those in the training task. To address this problem, we propose CausalCOMRL, a context-based OMRL method that integrates causal representation learning. The approach uncovers causal relationships among the task components and incorporates them into the task representations, enhancing the generalizability of RL agents. We further make the representations of different tasks easier to tell apart by using mutual information optimization and contrastive learning. Using these causal task representations, we employ Soft Actor-Critic (SAC) to optimize policies on meta-RL benchmarks. Experimental results show that CausalCOMRL achieves better performance than other methods on most benchmarks.
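The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common choice: a supervised InfoNCE-style loss in which context embeddings from the same task are treated as positives and all others as negatives. The function name `info_nce_loss`, the `temperature` value, and the toy embeddings are assumptions made for illustration, not details taken from the paper, and the causal and mutual-information components of CausalCOMRL are not modeled here.

```python
import math


def info_nce_loss(embeddings, task_ids, temperature=0.1):
    """Supervised InfoNCE-style loss over task embeddings (illustrative only).

    embeddings: list of equal-length, non-zero float vectors, one per context sample.
    task_ids:   list of task labels; samples sharing a label are positives.
    """
    # L2-normalize so the dot product below is a cosine similarity.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    total, anchors = 0.0, 0
    for i, z_i in enumerate(normed):
        # Temperature-scaled similarity to every other sample (self excluded).
        logits = {
            j: sum(a * b for a, b in zip(z_i, z_j)) / temperature
            for j, z_j in enumerate(normed)
            if j != i
        }
        positives = [j for j in logits if task_ids[j] == task_ids[i]]
        if not positives:
            continue  # an anchor with no same-task partner contributes nothing

        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))

        # Average negative log-probability of the positives for this anchor.
        total -= sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1

    if anchors == 0:
        raise ValueError("need at least one task with two or more samples")
    return total / anchors


# Toy 2-D embeddings forming two tight clusters (hypothetical data).
toy = [[1.0, 0.0], [1.0, 0.05], [1.0, -0.05],
       [0.0, 1.0], [0.05, 1.0], [-0.05, 1.0]]
matching = info_nce_loss(toy, [0, 0, 0, 1, 1, 1])    # labels agree with clusters
mismatched = info_nce_loss(toy, [0, 1, 0, 1, 0, 1])  # labels cut across clusters
print(matching < mismatched)
```

In this sketch the loss is lower when the embedding clusters agree with the task labels, which is the behavior a contrastive term is meant to encourage in a task encoder. In practice the loss would be computed on encoder outputs inside an autodiff framework rather than on fixed lists.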