Code Retrieval for MILP Instance Generation
This work addresses the need for efficient and flexible MILP instance generation to support learning-based solvers in fields like scheduling and logistics, offering a novel approach but with incremental improvements over existing methods.
The paper tackles the problem of generating high-quality Mixed-Integer Linear Programming (MILP) instances for learning-based solvers by reformulating it as a code generation task, introducing MILP-EmbedSim for similarity measurement and MILP-Retrieval for code retrieval, which outperforms baselines in both code and instance generation tasks.
Mixed-Integer Linear Programming (MILP) is widely used in fields such as scheduling, logistics, and planning. Enhancing the performance of MILP solvers, particularly learning-based solvers, requires substantial amounts of high-quality data. However, existing methods for MILP instance generation typically necessitate training a separate model for each problem class and are computationally intensive when generating new instances. To address these limitations, we reformulate the MILP Instance Generation task as MILP Code Generation task, enabling efficient, flexible, and interpretable instance generation through code. Since MILP instances generated from code can vary significantly in scale, we introduce MILP-EmbedSim, a new similarity metric that accurately measures the similarity between instances of varying sizes within the same problem class. Leveraging this metric, we propose MILP-Retrieval, a pipeline that retrieves generation code from library to produce MILP instances highly similar to target instance. MILP-Retrieval outperforms baselines in both MILP Code Generation and Instance Generation tasks, provides a novel perspective on MILP instance generation and opens new possibilities for learning-based solvers.