AI · Oct 24, 2025

AutoOpt: A Dataset and a Unified Framework for Automating Optimization Problem Solving

Ankur Sinha, Shobhit Arora, Dhaval Pujara

arXiv:2510.21436v1 · 2 citations · h-index: 2

Originality: Incremental advance

AI Analysis

This work targets automated optimization problem solving for researchers and practitioners. It appears incremental because it builds on existing methods such as deep learning and hybrid optimization.

This study introduces AutoOpt-11k, a dataset of over 11,000 images of optimization problems labeled with LaTeX and, for a subset, modeling-language code. It also introduces the AutoOpt framework, which solves an optimization problem directly from an image by combining deep learning models with a hybrid optimization method. The image-to-LaTeX model outperforms ChatGPT, Gemini, and Nougat on BLEU score, and the hybrid solver yields better results than common algorithms on complex test problems.

This study presents AutoOpt-11k, a unique image dataset of over 11,000 handwritten and printed mathematical optimization models corresponding to single-objective, multi-objective, multi-level, and stochastic optimization problems exhibiting various types of complexities such as non-linearity, non-convexity, non-differentiability, discontinuity, and high-dimensionality. The labels consist of the LaTeX representation for all the images and the modeling language representation for a subset of images. The dataset was created by 25 experts following ethical data creation guidelines and verified in two phases to avoid errors. Further, we develop the AutoOpt framework, a machine learning based automated approach for solving optimization problems, where the user just needs to provide an image of the formulation and AutoOpt solves it efficiently without any further human intervention. The AutoOpt framework consists of three modules: (i) M1 (Image_to_Text): a deep learning model performs the Mathematical Expression Recognition (MER) task to generate the LaTeX code corresponding to the optimization formulation in the image; (ii) M2 (Text_to_Text): a small-scale fine-tuned LLM generates the PYOMO script (optimization modeling language) from the LaTeX code; (iii) M3 (Optimization): a Bilevel Optimization based Decomposition (BOBD) method solves the optimization formulation described in the PYOMO script. We use the AutoOpt-11k dataset for training and testing of the deep learning models employed in AutoOpt. The deep learning model for the MER task (M1) outperforms ChatGPT, Gemini and Nougat on the BLEU score metric. The BOBD method (M3), which is a hybrid approach, yields better results on complex test problems compared to common approaches, such as the interior-point algorithm and the genetic algorithm.
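The abstract compares M1 with other systems on BLEU but does not say how the score is configured. As a rough illustration only, the sketch below computes a sentence-level BLEU between a predicted and a reference LaTeX token sequence, assuming whitespace tokenization, a single reference, uniform weights over 1- to 4-grams, and no smoothing. The function names and the example strings are hypothetical and are not taken from the paper or the dataset.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference (uniform weights, no smoothing).

    Assumption: this is a textbook-style BLEU, not necessarily the exact
    variant used to evaluate M1 in the paper.
    """
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            # Without smoothing, any empty n-gram level gives a score of zero.
            return 0.0
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


# Hypothetical token sequences for a tiny objective function.
reference = r"\min _ { x } x ^ { 2 } + y ^ { 2 }".split()
exact = list(reference)
one_error = r"\max _ { x } x ^ { 2 } + y ^ { 2 }".split()

print("exact match:", bleu(exact, reference))
print("one wrong token:", bleu(one_error, reference))
```

Under these assumptions an exact match scores 1.0, and a single wrong token lowers the score at every n-gram order it touches. This is one reason BLEU is a lenient proxy for MER quality: swapping `\min` for `\max` changes the problem entirely yet costs only a fraction of the score.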
