OC LG NE · Mar 23, 2023

RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research

arXiv:2303.13117v1 · 5 citations · h-index: 9 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work gives researchers and practitioners in operation research a flexible framework for incorporating recent RL advances, though the contribution is incremental because it builds on existing models.

The authors address the lack of flexibility in applying reinforcement learning to operation research problems by introducing RLOR, a framework that re-implements models such as the Attention Model and trains them with PPO, achieving a speedup of at least 8 times in training time.

Reinforcement learning has been applied in operation research and has shown promise in solving large combinatorial optimization problems. However, existing works focus on developing neural network architectures for specific problems. These works lack the flexibility to incorporate recent advances in reinforcement learning, as well as the flexibility to customize model architectures for operation research problems. In this work, we analyze end-to-end autoregressive models for vehicle routing problems and show that these models can benefit from recent advances in reinforcement learning when the model architecture is carefully re-implemented. In particular, we re-implemented the Attention Model and trained it with Proximal Policy Optimization (PPO) in CleanRL, showing a speedup of at least 8 times in training time. We hereby introduce RLOR, a flexible framework for Deep Reinforcement Learning for Operation Research. We believe that a flexible framework is key to developing deep reinforcement learning models for operation research problems. The code of our work is publicly available at https://github.com/cpwan/RLOR.
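
For readers unfamiliar with PPO, the clipped surrogate objective it optimizes can be sketched in a few lines. This is a minimal, standard-library illustration of the general PPO loss, not code taken from RLOR or CleanRL. The function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions chosen for illustration. In a routing setting, each action would correspond to choosing the next node to visit.

```python
import math

def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (a quantity to minimize).

    Hypothetical helper for illustration only; inputs are per-action
    log-probabilities under the new and old policies, plus advantages.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(advantages)
```

Clipping removes the incentive to move the policy far from the one that collected the data: once the ratio leaves the clip range in the direction favored by the advantage, the objective stops improving.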
