Reward Dimension Reduction for Scalable Multi-Objective Reinforcement Learning
This work addresses a scalability problem relevant to researchers and practitioners in multi-objective reinforcement learning, though the contribution appears incremental because it builds on existing dimension reduction techniques.
The paper tackles scalability challenges in multi-objective reinforcement learning by introducing a reward dimension reduction method, and reports significant performance improvements over existing online dimension reduction methods in environments with up to sixteen objectives.
In this paper, we introduce a simple yet effective reward dimension reduction method to tackle the scalability challenges of multi-objective reinforcement learning algorithms. Most existing approaches focus on optimizing two to four objectives, and their ability to scale to environments with more objectives remains uncertain. Our method reduces the dimension of the reward space to improve learning efficiency and policy performance in multi-objective settings. Unlike traditional dimension reduction methods, which are mostly designed for static datasets, our approach is tailored for online learning and preserves Pareto-optimality after transformation. We propose a new training and evaluation framework for reward dimension reduction in multi-objective reinforcement learning and demonstrate the superiority of our method in several environments, including one with sixteen objectives, where it significantly outperforms existing online dimension reduction methods.
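To make the phrase "preserves Pareto-optimality after transformation" concrete, the following is a minimal sketch and not the paper's method. The abstract does not specify the form of the transformation, so the sketch assumes a linear map whose weights are all strictly positive. Under that assumption, any return vector that is Pareto-optimal in the reduced space is also Pareto-optimal in the original space. The return vectors, the matrix, and all function names here are hypothetical and randomly generated for illustration only.

```python
import random


def dominates(u, v):
    """True if u Pareto-dominates v: no worse in every objective, better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_indices(points):
    """Indices of the points that no other point dominates."""
    return {
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    }


def reduce_rewards(matrix, vec):
    """Linear map: each reduced objective is a weighted sum of the originals."""
    return tuple(sum(w * x for w, x in zip(row, vec)) for row in matrix)


rng = random.Random(0)
n_objectives, n_reduced, n_policies = 16, 4, 200

# Hypothetical return vectors, one per candidate policy (random, not real data).
returns = [
    tuple(rng.random() for _ in range(n_objectives)) for _ in range(n_policies)
]

# Assumed reduction matrix: strictly positive entries, each row summing to 1.
matrix = []
for _ in range(n_reduced):
    row = [rng.uniform(0.1, 1.0) for _ in range(n_objectives)]
    total = sum(row)
    matrix.append([w / total for w in row])

reduced = [reduce_rewards(matrix, r) for r in returns]

front_original = pareto_indices(returns)
front_reduced = pareto_indices(reduced)

# Every policy on the reduced-space front is also on the original front.
preserved = front_reduced <= front_original
print(preserved)
```

The reasoning behind the check: if one return vector dominates another in the original space, their difference is non-negative with at least one positive entry, so every strictly positive weighted sum of that difference is positive. The dominating vector therefore also dominates in the reduced space, which means a point that is undominated after reduction cannot have been dominated before it. The converse does not hold in general, so the reduced front may be smaller than the original one.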