LG | May 4, 2025

D3HRL: A Distributed Hierarchical Reinforcement Learning Approach Based on Causal Discovery and Spurious Correlation Detection

Chenran Zhao, Dianxi Shi, Mengzhu Wang, Jianqiang Xia, Huanhuan Yang, Songchang Jin, Shaowu Yang, Chunping Qiu

arXiv:2505.01979v1 | 7.11 citations | h-index: 10 | Neural Networks

Originality: Incremental advance

AI Analysis

This work addresses challenges in HRL for complex sequential decision-making tasks. It is an incremental improvement that integrates causal methods into existing frameworks.

The paper tackles delay effects and spurious correlations in Hierarchical Reinforcement Learning (HRL) by proposing D3HRL, a causal HRL approach that uses distributed causal discovery to learn delayed causal relationships and conditional independence testing to filter out spurious ones. In 2D-MineCraft and MiniGrid experiments, the authors report superior sensitivity to delays and accurate causal identification.

Current Hierarchical Reinforcement Learning (HRL) algorithms excel in long-horizon sequential decision-making tasks but still face two challenges: delay effects and spurious correlations. To address them, we propose a causal HRL approach called D3HRL. First, D3HRL models delayed effects as causal relationships across different time spans and employs distributed causal discovery to learn these relationships. Second, it employs conditional independence testing to eliminate spurious correlations. Finally, D3HRL constructs and trains hierarchical policies based on the identified true causal relationships. These three steps are iteratively executed, gradually exploring the complete causal chain of the task. Experiments conducted in 2D-MineCraft and MiniGrid show that D3HRL demonstrates superior sensitivity to delay effects and accurately identifies causal relationships, leading to reliable decision-making in complex environments.
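To illustrate how conditional independence testing can separate a spurious correlation from a genuine dependence, here is a minimal sketch. It is not the D3HRL implementation: it assumes linear-Gaussian toy data and swaps in a partial-correlation test with a Fisher z-transform, and the names `partial_corr`, `fisher_z_pvalue`, and the variables `a`, `b`, `c` are hypothetical. Two variables share a common cause, so they correlate strongly, but the dependence should largely vanish once the common cause is conditioned on.

```python
import math
import numpy as np


def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    design = np.column_stack([np.ones(len(z)), z])  # intercept + conditioning variable
    resid_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    resid_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(resid_x, resid_y)[0, 1])


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: (partial) correlation is zero.

    n is the sample size, k the number of conditioning variables.
    """
    r = min(max(r, -0.999999), 0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# Toy data (assumption): c is a common cause of a and b; a does not cause b.
rng = np.random.default_rng(0)
n = 5000
c = rng.normal(size=n)
a = c + 0.5 * rng.normal(size=n)
b = c + 0.5 * rng.normal(size=n)

marginal_r = float(np.corrcoef(a, b)[0, 1])
conditional_r = partial_corr(a, b, c)
p_marginal = fisher_z_pvalue(marginal_r, n, k=0)
p_conditional = fisher_z_pvalue(conditional_r, n, k=1)

print(f"corr(a, b)     = {marginal_r:.3f}, p = {p_marginal:.3g}")
print(f"corr(a, b | c) = {conditional_r:.3f}, p = {p_conditional:.3g}")
```

In this toy setup the marginal correlation between `a` and `b` is strong, while the correlation conditioned on `c` is close to zero, which is the signature of a spurious link. A causal discovery pipeline would drop the `a` to `b` edge on that basis; the test the paper actually uses may differ.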
