LG OC | Nov 9, 2022

Deep W-Networks: Solving Multi-Objective Optimisation Problems With Deep Reinforcement Learning

Jernej Hribar, Luke Hackett, Ivana Dusparic

arXiv:2211.04813v23.35 citations | h-index: 23 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the problem of scaling multi-objective optimization for reinforcement learning practitioners, though it is incremental as it builds on existing DQN and W-learning methods.

The paper tackles the scalability issue of multi-objective reinforcement learning in large state spaces by extending the tabular W-learning algorithm with deep neural networks, resulting in Deep W-Networks that outperform DQN baselines and find Pareto fronts in benchmarks like deep sea treasure and multi-objective mountain car.

In this paper, we build on advances introduced by the Deep Q-Networks (DQN) approach to extend the multi-objective tabular Reinforcement Learning (RL) algorithm W-learning to large state spaces. The W-learning algorithm can naturally resolve the competition between multiple single policies in multi-objective environments. However, the tabular version does not scale well to environments with large state spaces. To address this issue, we replace the underlying Q-tables with DQN and propose adding W-Networks as a replacement for the tabular weight (W) representations. We evaluate the resulting Deep W-Networks (DWN) approach on two widely accepted multi-objective RL benchmarks: deep sea treasure and multi-objective mountain car. We show that DWN resolves the competition between multiple policies while outperforming a DQN baseline. Additionally, we demonstrate that the proposed algorithm can find the Pareto front in both tested environments.
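The arbitration idea behind W-learning can be illustrated with a minimal sketch. It assumes the classical W-learning scheme: each policy proposes its greedy action, the policy with the highest W value for the current state wins, and a policy that did not win moves its W value toward the difference between what it expected from its own proposal and what it actually obtained. The function names `select_action` and `w_target` are hypothetical, and plain arrays stand in for the outputs of the DQNs and W-Networks. The abstract does not give the paper's network architectures or training losses, so none are reproduced here.

```python
import numpy as np

def select_action(q_values, w_values):
    """Pick the action proposed by the policy with the highest W value.

    q_values: shape (n_policies, n_actions), Q-values per policy for the
              current state (in DWN these would come from per-policy DQNs).
    w_values: shape (n_policies,), W-values per policy for the current state
              (in DWN these would come from per-policy W-Networks).
    Returns (winning policy index, executed action).
    """
    q_values = np.asarray(q_values, dtype=float)
    w_values = np.asarray(w_values, dtype=float)
    proposals = q_values.argmax(axis=1)  # each policy's greedy action
    winner = int(w_values.argmax())      # the highest W value wins the state
    return winner, int(proposals[winner])

def w_target(q_state, proposed_action, reward, q_next_state, gamma=0.99):
    """Assumed classical W-learning target for a policy that did not win:
    the value it expected from its own proposal minus the one-step return
    it actually observed under the winner's action."""
    q_state = np.asarray(q_state, dtype=float)
    q_next_state = np.asarray(q_next_state, dtype=float)
    return float(q_state[proposed_action] - (reward + gamma * q_next_state.max()))

# Illustrative numbers only: two policies, three actions.
q = [[1.0, 0.2, 0.1],   # policy 0 prefers action 0
     [0.0, 0.3, 0.9]]   # policy 1 prefers action 2
w = [0.4, 0.7]          # policy 1 has more to lose in this state
winner, action = select_action(q, w)
loss_for_policy_0 = w_target(q[0], 0, reward=0.5,
                             q_next_state=[0.0, 0.5, 0.2], gamma=0.9)
```

In this sketch a large positive target means the losing policy gave up a lot by not being obeyed, which raises its W value and makes it more likely to win that state in the future.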

