Lifelong Learning for Neural powered Mixed Integer Programming
This work addresses the problem of maintaining solver performance over time for practitioners who solve NP-hard optimization problems, but its contribution is incremental, since it builds on existing learning-to-branch techniques.
The paper tackles catastrophic forgetting in learning-to-branch methods for Mixed Integer Programs (MIPs) when training data arrives continually. It proposes LIMIP, which combines a bipartite Graph Attention Network with knowledge distillation and elastic weight consolidation, and reports up to 50% better performance than baselines in lifelong learning scenarios.
Mixed Integer Programs (MIPs) are typically solved by the Branch-and-Bound algorithm. Recently, learning to imitate fast approximations of the expert strong branching heuristic has gained attention due to its success in reducing the running time for solving MIPs. However, existing learning-to-branch methods assume that the entire training data is available in a single training session. This assumption often does not hold, and if the training data is supplied continually over time, existing techniques suffer from catastrophic forgetting. In this work, we study the hitherto unexplored paradigm of Lifelong Learning to Branch on Mixed Integer Programs. To mitigate catastrophic forgetting, we propose LIMIP, which models an MIP instance as a bipartite graph and maps it to an embedding space using a bipartite Graph Attention Network. This rich embedding space avoids catastrophic forgetting through the application of knowledge distillation and elastic weight consolidation, wherein we learn which parameters are key to retaining efficacy and protect them from significant drift. We evaluate LIMIP on a series of NP-hard problems and establish that, in comparison to existing baselines, LIMIP is up to 50% better when confronted with lifelong learning.
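To make the two regularizers named in the abstract concrete, the following is a minimal sketch of how an imitation loss might be combined with a knowledge distillation term and an elastic weight consolidation (EWC) penalty. It is not LIMIP's actual objective: the function names, the temperature, the weights kd_weight and ewc_weight, and the use of a diagonal Fisher estimate as the importance measure are all assumptions made for illustration, and the candidate scores stand in for the output of a branching policy such as a bipartite Graph Attention Network.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of candidate scores."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, expert_index):
    """Imitation loss: negative log-probability of the expert's branching choice."""
    return -math.log(softmax(logits)[expert_index])


def distillation_loss(new_logits, old_logits, temperature=2.0):
    """KL(old || new) between temperature-softened branching distributions.

    The temperature value is an illustrative assumption.
    """
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(p_old, p_new) if p > 0.0)


def ewc_penalty(params, old_params, importance):
    """Quadratic drift penalty, weighted per parameter by its importance.

    `importance` is assumed to be a diagonal Fisher estimate from the
    previous task; the usual 1/2 factor is folded into the caller's weight.
    """
    return sum(
        f * (w - w_old) ** 2
        for w, w_old, f in zip(params, old_params, importance)
    )


def lifelong_loss(new_logits, old_logits, expert_index,
                  params, old_params, importance,
                  kd_weight=1.0, ewc_weight=1.0):
    """Hypothetical combined objective: imitation + distillation + EWC."""
    return (
        cross_entropy(new_logits, expert_index)
        + kd_weight * distillation_loss(new_logits, old_logits)
        + ewc_weight * ewc_penalty(params, old_params, importance)
    )
```

In this sketch, a model that reproduces the previous model's scores and has not moved its parameters pays only the imitation loss, while drift in parameters with high importance is penalized more heavily than drift in unimportant ones.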