LG AINov 13, 2023

Combinatorial Optimization with Policy Adaptation using Latent Space Search

Felix Chalumeau, Shikha Surana, Clement Bonnet, Nathan Grinsztajn, Arnu Pretorius, Alexandre Laterre, Thomas D. Barrett

arXiv:2311.13569v224.046 citationsh-index: 13Has Code

Originality Incremental advance

AI Analysis

This work addresses the problem of improving heuristic performance for combinatorial optimization, which is incremental as it builds on existing RL methods to enhance search strategies.

The paper tackled the challenge of designing performant algorithms for combinatorial optimization problems by proposing COMPASS, a reinforcement learning approach that parameterizes a diverse policy distribution conditioned on a latent space, and demonstrated that it outperforms state-of-the-art methods on 11 standard tasks and generalizes better on 18 transformed instance distributions.

Combinatorial Optimization underpins many real-world applications and yet, designing performant algorithms to solve these complex, typically NP-hard, problems remains a significant research challenge. Reinforcement Learning (RL) provides a versatile framework for designing heuristics across a broad spectrum of problem domains. However, despite notable progress, RL has not yet supplanted industrial solvers as the go-to solution. Current approaches emphasize pre-training heuristics that construct solutions but often rely on search procedures with limited variance, such as stochastically sampling numerous solutions from a single policy or employing computationally expensive fine-tuning of the policy on individual problem instances. Building on the intuition that performant search at inference time should be anticipated during pre-training, we propose COMPASS, a novel RL approach that parameterizes a distribution of diverse and specialized policies conditioned on a continuous latent space. We evaluate COMPASS across three canonical problems - Travelling Salesman, Capacitated Vehicle Routing, and Job-Shop Scheduling - and demonstrate that our search strategy (i) outperforms state-of-the-art approaches on 11 standard benchmarking tasks and (ii) generalizes better, surpassing all other approaches on a set of 18 procedurally transformed instance distributions.

View on arXiv PDF Code

Similar