LGMLSep 8, 2020

Graph neural networks-based Scheduler for Production planning problems using Reinforcement Learning

arXiv:2009.03836v248 citations
Originality Incremental advance
AI Analysis

This work addresses production planning inefficiencies for manufacturing industries by improving scheduling accuracy and generalization, though it is incremental as it builds on existing RL and GNN methods.

The authors tackled job shop scheduling problems by proposing GraSP-RL, a framework that uses graph neural networks with reinforcement learning to represent and optimize production planning, and it outperformed baseline methods like FIFO, tabu search, and genetic algorithms in minimizing makespan for a 30-job, 4-machine scenario.

Reinforcement learning (RL) is increasingly adopted in job shop scheduling problems (JSSP). But RL for JSSP is usually done using a vectorized representation of machine features as the state space. It has three major problems: (1) the relationship between the machine units and the job sequence is not fully captured, (2) exponential increase in the size of the state space with increasing machines/jobs, and (3) the generalization of the agent to unseen scenarios. We present a novel framework - GraSP-RL, GRAph neural network-based Scheduler for Production planning problems using Reinforcement Learning. It represents JSSP as a graph and trains the RL agent using features extracted using a graph neural network (GNN). While the graph is itself in the non-euclidean space, the features extracted using the GNNs provide a rich encoding of the current production state in the euclidean space, which is then used by the RL agent to select the next job. Further, we cast the scheduling problem as a decentralized optimization problem in which the learning agent is assigned to all the production units and the agent learns asynchronously from the data collected on all the production units. The GraSP-RL is then applied to a complex injection molding production environment with 30 jobs and 4 machines. The task is to minimize the makespan of the production plan. The schedule planned by GraSP-RL is then compared and analyzed with a priority dispatch rule algorithm like first-in-first-out (FIFO) and metaheuristics like tabu search (TS) and genetic algorithm (GA). The proposed GraSP-RL outperforms the FIFO, TS, and GA for the trained task of planning 30 jobs in JSSP. We further test the generalization capability of the trained agent on two different problem classes: Open shop system (OSS) and Reactive JSSP (RJSSP) where our method produces results better than FIFO and comparable results to TS and GA.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes