Thorsten Koch

h-index9

11papers

76citations

Novelty31%

AI Score43

Ranked #77,633 of 205,806 authors (top 38%)#236 in OC (top 27%)

11 Papers

OCDec 14, 2022

Cutting Plane Selection with Analytic Centers and Multiregression

Mark Turner, Timo Berthold, Mathieu Besançon et al.

Cutting planes are a crucial component of state-of-the-art mixed-integer programming solvers, with the choice of which subset of cuts to add being vital for solver performance. We propose new distance-based measures to qualify the value of a cut by quantifying the extent to which it separates relevant parts of the relaxed feasible set. For this purpose, we use the analytic centers of the relaxation polytope or of its optimal face, as well as alternative optimal solutions of the linear programming relaxation. We assess the impact of the choice of distance measure on root node performance and throughout the whole branch-and-bound tree, comparing our measures against those prevalent in the literature. Finally, by a multi-output regression, we predict the relative performance of each measure, using static features readily available before the separation process. Our results indicate that analytic center-based methods help to significantly reduce the number of branch-and-bound nodes needed to explore the search space and that our multiregression approach can further improve on any individual method.

OCDec 13, 2023Code

PySCIPOpt-ML: Embedding Trained Machine Learning Models into Mixed-Integer Programs

Mark Turner, Antonia Chmiela, Thorsten Koch et al.

A standard tool for modelling real-world optimisation problems is mixed-integer programming (MIP). However, for many of these problems, information about the relationships between variables is either incomplete or highly complex, making it difficult or even impossible to model the problem directly. To overcome these hurdles, machine learning (ML) predictors are often used to represent these relationships and are then embedded in the MIP as surrogate models. Due to the large amount of available ML frameworks and the complexity of many ML predictors, formulating such predictors into MIPs is a highly non-trivial task. In this paper, we introduce PySCIPOpt-ML, an open-source tool for the automatic formulation and embedding of trained ML predictors into MIPs. By directly interfacing with a broad range of commonly used ML frameworks and an open-source MIP solver, PySCIPOpt-ML provides a way to easily integrate ML constraints into optimisation problems. Alongside PySCIPOpt-ML, we introduce, SurrogateLIB, a library of MIP instances with embedded ML constraints, and present computational results over SurrogateLIB, providing intuition on the scale of ML predictors that can be practically embedded. The project is available at https://github.com/Opt-Mucca/PySCIPOpt-ML.

49.0DLMay 12

Reconnecting Fragmented Citation Networks with Semantic Augmentation

Vu Thi Huong, Annika Buchholz, Imene Khebouri et al.

Citation graphs are fundamental tools for modeling scientific structure, but are often fragmented due to missing citations of scientifically connected articles. To address this issue, we propose a computationally efficient hybrid framework integrating citation topology with large language model (LLM)-based text similarity. Using 662,369 Web of Science publications in Mathematics and Operations Research & Management Science, we augment the original graph by adding semantic edges from small, disconnected components and weighting existing citations according to textual similarity. Semantic augmentation substantially reduces fragmentation while preserving disciplinary homogeneity. Compared to embedding-only clustering, cluster detection on augmented graphs using the Leiden algorithm retains structural interpretability while offering multi-scale organization. The method scales efficiently to large datasets and offers a practical strategy for strengthening citation-based indicators without collapsing disciplinary boundaries.

OCFeb 22, 2022Code

Adaptive Cut Selection in Mixed-Integer Linear Programming

Mark Turner, Thorsten Koch, Felipe Serrano et al.

Cutting plane selection is a subroutine used in all modern mixed-integer linear programming solvers with the goal of selecting a subset of generated cuts that induce optimal solver performance. These solvers have millions of parameter combinations, and so are excellent candidates for parameter tuning. Cut selection scoring rules are usually weighted sums of different measurements, where the weights are parameters. We present a parametric family of mixed-integer linear programs together with infinitely many family-wide valid cuts. Some of these cuts can induce integer optimal solutions directly after being applied, while others fail to do so even if an infinite amount are applied. We show for a specific cut selection rule, that any finite grid search of the parameter space will always miss all parameter values, which select integer optimal inducing cuts in an infinite amount of our problems. We propose a variation on the design of existing graph convolutional neural networks, adapting them to learn cut selection rule parameters. We present a reinforcement learning framework for selecting cuts, and train our design using said framework over MIPLIB 2017 and a neural network verification data set. Our framework and design show that adaptive cut selection does substantially improve performance over a diverse set of instances, but that finding a single function describing such a rule is difficult. Code for reproducing all experiments is available at https://github.com/Opt-Mucca/Adaptive-Cutsel-MILP.

LGDec 4, 2025

Multi-Agent Reinforcement Learning for Intraday Operating Rooms Scheduling under Uncertainty

Kailiang Liu, Ying Chen, Ralf Borndörfer et al.

Intraday surgical scheduling is a multi-objective decision problem under uncertainty-balancing elective throughput, urgent and emergency demand, delays, sequence-dependent setups, and overtime. We formulate the problem as a cooperative Markov game and propose a multi-agent reinforcement learning (MARL) framework in which each operating room (OR) is an agent trained with centralized training and decentralized execution. All agents share a policy trained via Proximal Policy Optimization (PPO), which maps rich system states to actions, while a within-epoch sequential assignment protocol constructs conflict-free joint schedules across ORs. A mixed-integer pre-schedule provides reference starting times for electives; we impose type-specific quadratic delay penalties relative to these references and a terminal overtime penalty, yielding a single reward that captures throughput, timeliness, and staff workload. In simulations reflecting a realistic hospital mix (six ORs, eight surgery types, random urgent and emergency arrivals), the learned policy outperforms six rule-based heuristics across seven metrics and three evaluation subsets, and, relative to an ex post MIP oracle, quantifies optimality gaps. Policy analytics reveal interpretable behavior-prioritizing emergencies, batching similar cases to reduce setups, and deferring lower-value electives. We also derive a suboptimality bound for the sequential decomposition under simplifying assumptions. We discuss limitations-including OR homogeneity and the omission of explicit staffing constraints-and outline extensions. Overall, the approach offers a practical, interpretable, and tunable data-driven complement to optimization for real-time OR scheduling.

CLFeb 4

Mapping the Web of Science, a large-scale graph and text-based dataset with LLM embeddings

Tim Kunt, Annika Buchholz, Imene Khebouri et al.

Large text data sets, such as publications, websites, and other text-based media, inherit two distinct types of features: (1) the text itself, its information conveyed through semantics, and (2) its relationship to other texts through links, references, or shared attributes. While the latter can be described as a graph structure and can be handled by a range of established algorithms for classification and prediction, the former has recently gained new potential through the use of LLM embedding models. Demonstrating these possibilities and their practicability, we investigate the Web of Science dataset, containing ~56 million scientific publications through the lens of our proposed embedding method, revealing a self-structured landscape of texts.

OCJun 4, 2025

Vu Thi Huong, Ida Litzel, Thorsten Koch

Fuzzy clustering, which allows an article to belong to multiple clusters with soft membership degrees, plays a vital role in analyzing publication data. This problem can be formulated as a constrained optimization model, where the goal is to minimize the discrepancy between the similarity observed from data and the similarity derived from a predicted distribution. While this approach benefits from leveraging state-of-the-art optimization algorithms, tailoring them to work with real, massive databases like OpenAlex or Web of Science - containing about 70 million articles and a billion citations - poses significant challenges. We analyze potentials and challenges of the approach from both mathematical and computational perspectives. Among other things, second-order optimality conditions are established, providing new theoretical insights, and practical solution methods are proposed by exploiting the structure of the problem. Specifically, we accelerate the gradient projection method using GPU-based parallel computing to efficiently handle large-scale data.

DLMay 15, 2025

Clustering scientific publications: lessons learned through experiments with a real citation network

Vu Thi Huong, Thorsten Koch

Clustering scientific publications can reveal underlying research structures within bibliographic databases. Graph-based clustering methods, such as spectral, Louvain, and Leiden algorithms, are frequently utilized due to their capacity to effectively model citation networks. However, their performance may degrade when applied to real-world data. This study evaluates the performance of these clustering algorithms on a citation graph comprising approx. 700,000 papers and 4.6 million citations extracted from Web of Science. The results show that while scalable methods like Louvain and Leiden perform efficiently, their default settings often yield poor partitioning. Meaningful outcomes require careful parameter tuning, especially for large networks with uneven structures, including a dense core and loosely connected papers. These findings highlight practical lessons about the challenges of large-scale data, method selection and tuning based on specific structures of bibliometric clustering tasks.

SEAug 25, 2021

AppSecure.nrw Software Security Study

Stefan Dziwok, Thorsten Koch, Sven Merschjohann et al.

In recent years, the World Economic Forum has identified software security as the most significant technological risk to the world's population, as software-intensive systems process critical data and provide critical services. This raises the question of the extent to which German companies are addressing software security in developing and operating their software products. This paper reports on the results of an extensive study among developers, product owners, and managers to answer this question. Our results show that ensuring security is a multi-faceted challenge for companies, involving low awareness, inaccurate self-assessment, and a lack of competence on the topic of secure software development among all stakeholders. The current situation in software development is therefore detrimental to the security of software products in the medium and long term.

MEFeb 18, 2021

The Variational Bayesian Inference for Network Autoregression Models

Wei-Ting Lai, Ray-Bing Chen, Ying Chen et al.

We develop a variational Bayesian (VB) approach for estimating large-scale dynamic network models in the network autoregression framework. The VB approach allows for the automatic identification of the dynamic structure of such a model and obtains a direct approximation of the posterior density. Compared to Markov Chain Monte Carlo (MCMC) based sampling approaches, the VB approach achieves enhanced computational efficiency without sacrificing estimation accuracy. In the simulation study conducted here, the proposed VB approach detects various types of proper active structures for dynamic network models. Compared to the alternative approach, the proposed method achieves similar or better accuracy, and its computational time is halved. In a real data analysis scenario of day-ahead natural gas flow prediction in the German gas transmission network with 51 nodes between October 2013 and September 2015, the VB approach delivers promising forecasting accuracy along with clearly detected structures in terms of dynamic dependence.

OCFeb 3, 2021

Generative deep learning for decision making in gas networks

Lovis Anderson, Mark Turner, Thorsten Koch

A decision support system relies on frequent re-solving of similar problem instances. While the general structure remains the same in corresponding applications, the input parameters are updated on a regular basis. We propose a generative neural network design for learning integer decision variables of mixed-integer linear programming (MILP) formulations of these problems. We utilise a deep neural network discriminator and a MILP solver as our oracle to train our generative neural network. In this article, we present the results of our design applied to the transient gas optimisation problem. With the trained network we produce a feasible solution in 2.5s, use it as a warm-start solution, and thereby decrease global optimal solution solve time by 60.5%.