Where to Intervene: Action Selection in Deep Reinforcement Learning
This work addresses a critical bottleneck for RL in unknown and complex environments, offering a generalizable solution that reduces both reliance on expert design and computational cost.
The paper tackles high-dimensional action selection in deep reinforcement learning by proposing a data-driven approach that selects minimal sufficient actions and controls the false discovery rate via knockoff sampling, outperforming alternative techniques in both variable selection and achieved rewards.
Deep reinforcement learning (RL) has gained widespread adoption in recent years but faces significant challenges, particularly in unknown and complex environments. Among these, high-dimensional action selection stands out as a critical problem. Existing works often require a sophisticated prior design to eliminate redundancy in the action space, relying heavily on domain expert experience or involving high computational complexity, which limits their generalizability across different RL tasks. In this paper, we address these challenges by proposing a general data-driven action selection approach with model-free and computationally friendly properties. Our method not only selects minimal sufficient actions but also controls the false discovery rate via knockoff sampling. More importantly, we seamlessly integrate the action selection into deep RL methods during online training. Empirical experiments validate the established theoretical guarantees, demonstrating that our method surpasses various alternative techniques in terms of both performance in variable selection and overall achieved rewards.
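To make the false discovery rate control concrete, the following is a minimal sketch of the generic knockoff+ selection rule, not necessarily the exact procedure used in the paper. It assumes each action dimension j already has a statistic w[j] that compares the importance of the real action against its knockoff copy; how those statistics are computed is not specified here. The function name `knockoff_plus_select` and the toy values are hypothetical.

```python
def knockoff_plus_select(w, q=0.1):
    """Return indices kept by the knockoff+ threshold at target FDR level q.

    w[j] > 0 suggests action dimension j looked more important than its
    knockoff copy; w[j] < 0 suggests the knockoff copy looked more important.
    """
    # Candidate thresholds are the distinct nonzero magnitudes, smallest first.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        # Large negative statistics serve as a proxy for false discoveries;
        # the +1 is the conservative knockoff+ correction.
        false_proxy = 1 + sum(1 for x in w if x <= -t)
        discoveries = max(1, sum(1 for x in w if x >= t))
        if false_proxy / discoveries <= q:
            return [j for j, x in enumerate(w) if x >= t]
    return []  # no threshold meets the target level, so nothing is selected


# Hypothetical statistics for 10 action dimensions (illustrative values only).
toy_w = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, -0.5, 0.3, -0.2, 0.1]
selected = knockoff_plus_select(toy_w, q=0.2)
print(selected)  # [0, 1, 2, 3, 4, 5]
```

In an online RL setting, the selected indices could then be used to restrict the policy to those action dimensions, although the details of that integration are an assumption here rather than something the abstract describes.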