LGMLJun 17, 2020

Learning What to Defer for Maximum Independent Sets

arXiv:2006.09607v293 citations
AI Analysis

This addresses the problem of applying DRL to large-scale combinatorial optimization, such as the maximum independent set problem, with incremental improvements in efficiency for researchers and practitioners in fields like computer science and operations research.

The paper tackles the scalability issue of deep reinforcement learning (DRL) solvers for combinatorial optimization by proposing a novel scheme called learning what to defer (LwD), which adaptively adjusts the number of decision stages to handle large-scale graphs, demonstrating significant improvement over state-of-the-art DRL and outperforming conventional solvers on graphs with millions of vertices under time constraints.

Designing efficient algorithms for combinatorial optimization appears ubiquitously in various scientific fields. Recently, deep reinforcement learning (DRL) frameworks have gained considerable attention as a new approach: they can automate the design of a solver while relying less on sophisticated domain knowledge of the target problem. However, the existing DRL solvers determine the solution using a number of stages proportional to the number of elements in the solution, which severely limits their applicability to large-scale graphs. In this paper, we seek to resolve this issue by proposing a novel DRL scheme, coined learning what to defer (LwD), where the agent adaptively shrinks or stretch the number of stages by learning to distribute the element-wise decisions of the solution at each stage. We apply the proposed framework to the maximum independent set (MIS) problem, and demonstrate its significant improvement over the current state-of-the-art DRL scheme. We also show that LwD can outperform the conventional MIS solvers on large-scale graphs having millions of vertices, under a limited time budget.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes