J. Kyle Brubaker

h-index5

8papers

360citations

Novelty39%

AI Score45

Ranked #44,522 of 194,257 authors (top 23%)#10,294 in LG (top 26%)

8 Papers

5.4AIJun 6, 2023

Explainable AI using expressive Boolean formulas

Gili Rosenberg, J. Kyle Brubaker, Martin J. A. Schuetz et al.

We propose and implement an interpretable machine learning classification model for Explainable AI (XAI) based on expressive Boolean formulas. Potential applications include credit scoring and diagnosis of medical conditions. The Boolean formula defines a rule with tunable complexity (or interpretability), according to which input data are classified. Such a formula can include any operator that can be applied to one or more Boolean variables, thus providing higher expressivity compared to more rigid rule-based and tree-based approaches. The classifier is trained using native local optimization techniques, efficiently searching the space of feasible formulas. Shallow rules can be determined by fast Integer Linear Programming (ILP) or Quadratic Unconstrained Binary Optimization (QUBO) solvers, potentially powered by special purpose hardware or quantum devices. We combine the expressivity and efficiency of the native local optimizer with the fast operation of these devices by executing non-local moves that optimize over subtrees of the full Boolean formula. We provide extensive numerical benchmarking results featuring several baselines on well-known public datasets. Based on the results, we find that the native local rule classifier is generally competitive with the other classifiers. The addition of non-local moves achieves similar results with fewer iterations, and therefore using specialized or quantum hardware could lead to a speedup by fast proposal of non-local moves.

8.6DIS-NNFeb 3, 2023

Reply to: Modern graph neural networks do worse than classical greedy algorithms in solving combinatorial optimization problems like maximum independent set

Martin J. A. Schuetz, J. Kyle Brubaker, Helmut G. Katzgraber

We provide a comprehensive reply to the comment written by Chiara Angelini and Federico Ricci-Tersenghi [arXiv:2206.13211] and argue that the comment singles out one particular non-representative example problem, entirely focusing on the maximum independent set (MIS) on sparse graphs, for which greedy algorithms are expected to perform well. Conversely, we highlight the broader algorithmic development underlying our original work, and (within our original framework) provide additional numerical results showing sizable improvements over our original results, thereby refuting the comment's performance statements. We also provide results showing run-time scaling superior to the results provided by Angelini and Ricci-Tersenghi. Furthermore, we show that the proposed set of random d-regular graphs does not provide a universal set of benchmark instances, nor do greedy heuristics provide a universal algorithmic baseline. Finally, we argue that the internal (parallel) anatomy of graph neural networks is very different from the (sequential) nature of greedy algorithms and emphasize that graph neural networks have demonstrated their potential for superior scalability compared to existing heuristics such as parallel tempering. We conclude by discussing the conceptual novelty of our work and outline some potential extensions.

8.8LGFeb 3, 2023

Reply to: Inability of a graph neural network heuristic to outperform greedy algorithms in solving combinatorial optimization problems

Martin J. A. Schuetz, J. Kyle Brubaker, Helmut G. Katzgraber

We provide a comprehensive reply to the comment written by Stefan Boettcher [arXiv:2210.00623] and argue that the comment singles out one particular non-representative example problem, entirely focusing on the maximum cut problem (MaxCut) on sparse graphs, for which greedy algorithms are expected to perform well. Conversely, we highlight the broader algorithmic development underlying our original work, and (within our original framework) provide additional numerical results showing sizable improvements over our original data, thereby refuting the comment's original performance statements. Furthermore, it has already been shown that physics-inspired graph neural networks (PI-GNNs) can outperform greedy algorithms, in particular on hard, dense instances. We also argue that the internal (parallel) anatomy of graph neural networks is very different from the (sequential) nature of greedy algorithms, and (based on their usage at the scale of real-world social networks) point out that graph neural networks have demonstrated their potential for superior scalability compared to existing heuristics such as extremal optimization. Finally, we conclude highlighting the conceptual novelty of our work and outline some potential extensions.

6.5OCApr 14

Applying a Random-Key Optimizer on Mixed Integer Programs

Antonio A. Chaves, Mauricio G. C. Resende, Carise E. Schmidt et al.

Mixed-Integer Programs (MIPs) are NP-hard optimization models that arise in a broad range of decision-making applications, including finance, logistics, energy systems, and network design. Although modern commercial solvers have achieved remarkable progress and perform effectively on many small- and medium-sized instances, their performance often degrades when confronted with large-cale or highly constrained formulations. This paper explores the use of the Random-Key Optimizer (RKO) framework as a flexible, metaheuristic alternative for computing high-quality solutions to MIPs through the design of problem-specific decoders. The proposed approach separates the search process from feasibility enforcement by operating in a continuous random-key space while mapping candidate solutions to feasible integer solutions via efficient decoding procedures. We evaluate the methodology on two representative and structurally distinct benchmark problems: the mean-variance Markowitz portfolio optimization problem with buy-in and cardinality constraints, and the Time-Dependent Traveling Salesman Problem. For each formulation, tailored decoders are developed to reduce the effective search space, promote feasibility, and accelerate convergence. Computational experiments demonstrate that RKO consistently produces competitive, and in several cases superior, solutions compared to a state-of-the-art commercial MIP solver, both in terms of solution quality and computational time. These results highlight the potential of RKO as a scalable and versatile heuristic framework for tackling challenging large-scale MIPs.

6.0LGApr 27Code

Crystal structure prediction using graph neural combinatorial optimization

Stavros Gerolymatos, J. Kyle Brubaker, Martin J. A. Schuetz et al.

Crystalline materials are widely used in technological applications, yet their discovery remains a significant challenge. As their properties are driven by structure, crystal structure prediction (CSP) methods play a central role in computational approaches aiming to accelerate this process. Previously, CSP has been approached from a combinatorial optimization perspective, with the core challenge of allocating atoms on a fine grid of predefined discrete positions within a unit cell while minimizing their interaction energy. Exact mathematical optimization methods provide guaranteed solutions, but they become computationally expensive for large-scale instances, where the atomic configuration space grows rapidly, particularly in the absence of additional symmetry constraints. In this work, we introduce a neural combinatorial optimization approach to the atom allocation challenge and, subsequently, CSP, based on graph neural networks (GNNs), which can effectively sample from the distribution of feasible structures in an unsupervised manner. We leverage expander graphs to construct computational graphs over discrete positions that capture both short- and long-range interactions between atoms, and employ the Gumbel-Sinkhorn approach to enforce the desired stoichiometry of the generated structures. We demonstrate that our method outperforms classical heuristic approaches and is competitive with a commercial optimization solver across a range of chemical compositions. This enables the use of ever-expanding GPU infrastructure to tackle the inherent combinatorial challenges of CSP, paving the way for scaling beyond current capabilities.

4.6LGNov 26, 2024

Scalable iterative pruning of large language and vision models using block coordinate descent

Gili Rosenberg, J. Kyle Brubaker, Martin J. A. Schuetz et al.

Pruning neural networks, which involves removing a fraction of their weights, can often maintain high accuracy while significantly reducing model complexity, at least up to a certain limit. We present a neural network pruning technique that builds upon the Combinatorial Brain Surgeon, but solves an optimization problem over a subset of the network weights in an iterative, block-wise manner using block coordinate descent. The iterative, block-based nature of this pruning technique, which we dub ``iterative Combinatorial Brain Surgeon'' (iCBS) allows for scalability to very large models, including large language models (LLMs), that may not be feasible with a one-shot combinatorial optimization approach. When applied to large models like Mistral and DeiT, iCBS achieves higher performance metrics at the same density levels compared to existing pruning methods such as Wanda. This demonstrates the effectiveness of this iterative, block-wise pruning method in compressing and optimizing the performance of large deep learning models, even while optimizing over only a small fraction of the weights. Moreover, our approach allows for a quality-time (or cost) tradeoff that is not available when using a one-shot pruning technique alone. The block-wise formulation of the optimization problem enables the use of hardware accelerators, potentially offsetting the increased computational costs compared to one-shot pruning methods like Wanda. In particular, the optimization problem solved for each block is quantum-amenable in that it could, in principle, be solved by a quantum computer.

16.5LGFeb 3, 2022Code

Graph Coloring with Physics-Inspired Graph Neural Networks

Martin J. A. Schuetz, J. Kyle Brubaker, Zhihuai Zhu et al.

We show how graph neural networks can be used to solve the canonical graph coloring problem. We frame graph coloring as a multi-class node classification problem and utilize an unsupervised training strategy based on the statistical physics Potts model. Generalizations to other multi-class problems such as community detection, data clustering, and the minimum clique cover problem are straightforward. We provide numerical benchmark results and illustrate our approach with an end-to-end application for a real-world scheduling use case within a comprehensive encode-process-decode framework. Our optimization approach performs on par or outperforms existing solvers, with the ability to scale to problems with millions of variables.

29.3LGJul 2, 2021Code

Combinatorial Optimization with Physics-Inspired Graph Neural Networks

Martin J. A. Schuetz, J. Kyle Brubaker, Helmut G. Katzgraber

Combinatorial optimization problems are pervasive across science and industry. Modern deep learning tools are poised to solve these problems at unprecedented scales, but a unifying framework that incorporates insights from statistical physics is still outstanding. Here we demonstrate how graph neural networks can be used to solve combinatorial optimization problems. Our approach is broadly applicable to canonical NP-hard problems in the form of quadratic unconstrained binary optimization problems, such as maximum cut, minimum vertex cover, maximum independent set, as well as Ising spin glasses and higher-order generalizations thereof in the form of polynomial unconstrained binary optimization problems. We apply a relaxation strategy to the problem Hamiltonian to generate a differentiable loss function with which we train the graph neural network and apply a simple projection to integer variables once the unsupervised training process has completed. We showcase our approach with numerical results for the canonical maximum cut and maximum independent set problems. We find that the graph neural network optimizer performs on par or outperforms existing solvers, with the ability to scale beyond the state of the art to problems with millions of variables.