Federico Ricci-Tersenghi

DIS-NN
h-index4
16papers
236citations
Novelty52%
AI Score51

16 Papers

LGJun 27, 2022
Modern graph neural networks do worse than classical greedy algorithms in solving combinatorial optimization problems like maximum independent set

Maria Chiara Angelini, Federico Ricci-Tersenghi

The recent work ``Combinatorial Optimization with Physics-Inspired Graph Neural Networks'' [Nat Mach Intell 4 (2022) 367] introduces a physics-inspired unsupervised Graph Neural Network (GNN) to solve combinatorial optimization problems on sparse graphs. To test the performances of these GNNs, the authors of the work show numerical results for two fundamental problems: maximum cut and maximum independent set (MIS). They conclude that "the graph neural network optimizer performs on par or outperforms existing solvers, with the ability to scale beyond the state of the art to problems with millions of variables." In this comment, we show that a simple greedy algorithm, running in almost linear time, can find solutions for the MIS problem of much better quality than the GNN. The greedy algorithm is faster by a factor of $10^4$ with respect to the GNN for problems with a million variables. We do not see any good reason for solving the MIS with these GNN, as well as for using a sledgehammer to crack nuts. In general, many claims of superiority of neural networks in solving combinatorial problems are at risk of being not solid enough, since we lack standard benchmarks based on really hard problems. We propose one of such hard benchmarks, and we hope to see future neural network optimizers tested on these problems before any claim of superiority is made.

LGAug 2, 2024
Efficient Graph Coloring with Neural Networks: A Physics-Inspired Approach for Large Graphs

Lorenzo Colantonio, Andrea Cacioppo, Federico Scarpati et al.

Combinatorial optimization problems near algorithmic phase transitions represent a fundamental challenge for both classical algorithms and machine learning approaches. Among them, graph coloring stands as a prototypical constraint satisfaction problem exhibiting sharp dynamical and satisfiability thresholds. Here we introduce a physics-inspired neural framework that learns to solve large-scale graph coloring instances by combining graph neural networks with statistical-mechanics principles. Our approach integrates a planting-based supervised signal, symmetry-breaking regularization, and iterative noise-annealed neural dynamics to navigate clustered solution landscapes. When the number of iterations scales quadratically with graph size, the learned solver reaches algorithmic thresholds close to the theoretical dynamical transition in random graphs and achieves near-optimal detection performance in the planted inference regime. The model generalizes from small training graphs to instances orders of magnitude larger, demonstrating that neural architectures can learn scalable algorithmic strategies that remain effective in hard connectivity regions. These results establish a general paradigm for learning neural solvers that operate near fundamental phase boundaries in combinatorial optimization and inference.

DIS-NNSep 11, 2023
Stochastic Gradient Descent-like relaxation is equivalent to Metropolis dynamics in discrete optimization and inference problems

Maria Chiara Angelini, Angelo Giorgio Cavaliere, Raffaele Marino et al.

Is Stochastic Gradient Descent (SGD) substantially different from Metropolis Monte Carlo dynamics? This is a fundamental question at the time of understanding the most used training algorithm in the field of Machine Learning, but it received no answer until now. Here we show that in discrete optimization and inference problems, the dynamics of an SGD-like algorithm resemble very closely that of Metropolis Monte Carlo with a properly chosen temperature, which depends on the mini-batch size. This quantitative matching holds both at equilibrium and in the out-of-equilibrium regime, despite the two algorithms having fundamental differences (e.g.\ SGD does not satisfy detailed balance). Such equivalence allows us to use results about performances and limits of Monte Carlo algorithms to optimize the mini-batch size in the SGD-like algorithm and make it efficient at recovering the signal in hard inference problems.

67.6DIS-NNApr 7
DYNAMITE: A high-performance framework for solving Dynamical Mean-Field Equations

Johannes Lang, Vincenzo Citro, Luca Leuzzi et al.

Understanding the dynamics of systems evolving in complex and rugged energy landscapes is central across physics, economics, biology, and computer science. Disordered mean-field models provide a powerful framework, as exact Dynamical Mean-Field Equations (DMFE) can be derived. However, solving the DMFE -- a set of coupled integral-differential equations for two-time functions -- remains a major numerical challenge. So far, large-time solutions of DMFE rely either on analytical approaches, such as the Cugliandolo--Kurchan ansatz based on assumptions like weak ergodicity breaking (which is known to fail in some cases), or on numerical integrations that reliably reach times $O(10^3)$ and extend further only via poorly controlled approximations. Consequently, no general method currently exists to solve DMFE at very long times, limiting the study of slow dynamics in complex landscapes. We present \textsc{Dynamite} (DYNAmical Mean-fIeld Time Evolution solver), a high-performance framework for solving DMFE up to unprecedented times $t=O(10^7)$. It combines non-uniform interpolation, adaptive time stepping, and numerical `renormalization' of memory, enabling accurate evaluation of history integrals. Its asymptotic runtime is linear, with sublinear memory scaling. Stability and precision are ensured via an adaptive Runge--Kutta scheme and periodic sparsification of the past. \textsc{Dynamite} achieves orders-of-magnitude speedups over uniform-grid methods while maintaining accuracy and reproducibility on CPU and GPU architectures. Benchmarks on glassy mean-field models, including the mixed spherical $p$-spin system, demonstrate access to aging and relaxation regimes previously out of reach. The framework provides a reproducible and extensible foundation for studying long-memory dynamical systems.

DIS-NNFeb 20Code
Benchmarking Graph Neural Networks in Solving Hard Constraint Satisfaction Problems

Geri Skenderi, Lorenzo Buffoni, Francesco D'Amico et al.

Graph neural networks (GNNs) are increasingly applied to hard optimization problems, often claiming superiority over classical heuristics. However, such claims risk being unsolid due to a lack of standard benchmarks on truly hard instances. From a statistical physics perspective, we propose new hard benchmarks based on random problems. We provide these benchmarks, along with performance results from both classical heuristics and GNNs. Our fair comparison shows that classical algorithms still outperform GNNs. We discuss the challenges for neural networks in this domain. Future claims of superiority can be made more robust using our benchmarks, available at https://github.com/ArtLabBocconi/RandCSPBench.

27.3ITApr 15
Phase transition in compressed sensing using log-sum penalty and adaptive smoothing

Keisuke Morita, Federico Ricci-Tersenghi, Masayuki Ohzeki

In many real-world problems, recovering sparse signals from underdetermined linear systems remains a fundamental challenge. Although $\ell_1$ norm minimization is widely used, it suffers from estimation bias that prevents it from reaching the Bayes-optimal reconstruction limit. Nonconvex alternatives, such as the log-sum penalty, have been proposed to promote stronger sparsity. However, maintaining their algorithmic stability is challenging. To address this challenge, we introduce an adaptive smoothing strategy within an approximate message passing framework to mitigate algorithmic instability. Furthermore, we evaluate the typical exact-recovery threshold for Gaussian measurement matrices using the replica method and state evolution. The results indicate that the adaptive method achieves exact recovery over a broader region than $\ell_1$ norm minimization, although metastable states hinder reaching the information-theoretic limit.

DIS-NNMay 28, 2025
Performance of machine-learning-assisted Monte Carlo in sampling from simple statistical physics models

Luca Maria Del Bono, Federico Ricci-Tersenghi, Francesco Zamponi

Recent years have seen a rise in the application of machine learning techniques to aid the simulation of hard-to-sample systems that cannot be studied using traditional methods. Despite the introduction of many different architectures and procedures, a wide theoretical understanding is still lacking, with the risk of suboptimal implementations. As a first step to address this gap, we provide here a complete analytic study of the widely-used Sequential Tempering procedure applied to a shallow MADE architecture for the Curie-Weiss model. The contribution of this work is twofold: firstly, we give a description of the optimal weights and of the training under Gradient Descent optimization. Secondly, we compare what happens in Sequential Tempering with and without the addition of local Metropolis Monte Carlo steps. We are thus able to give theoretical predictions on the best procedure to apply in this case. This work establishes a clear theoretical basis for the integration of machine learning techniques into Monte Carlo sampling and optimization.

DIS-NNFeb 18, 2025
Evidence of Replica Symmetry Breaking under the Nishimori conditions in epidemic inference on graphs

Alfredo Braunstein, Louise Budzynski, Matteo Mariani et al.

In Bayesian inference, computing the posterior distribution from the data is typically a non-trivial problem, which usually requires approximations such as mean-field approaches or numerical methods, like the Monte Carlo Markov Chain. Being a high-dimensional distribution over a set of correlated variables, the posterior distribution can undergo the notorious replica symmetry breaking transition. When it happens, several mean-field methods and virtually every Monte Carlo scheme can not provide a reasonable approximation to the posterior and its marginals. Replica symmetry is believed to be guaranteed whenever the data is generated with known prior and likelihood distributions, namely under the so-called Nishimori conditions. In this paper, we break this belief, by providing a counter-example showing that, under the Nishimori conditions, replica symmetry breaking arises. Introducing a simple, geometrical model that can be thought of as a patient zero retrieval problem in a highly infectious regime of the epidemic Susceptible-Infectious model, we show that under the Nishimori conditions, there is evidence of replica symmetry breaking. We achieve this result by computing the instability of the replica symmetric cavity method toward the one step replica symmetry broken phase. The origin of this phenomenon -- replica symmetry breaking under the Nishimori conditions -- is likely due to the correlated disorder appearing in the epidemic models.

DIS-NNOct 22, 2025
Demonstrating Real Advantage of Machine-Learning-Enhanced Monte Carlo for Combinatorial Optimization

Luca Maria Del Bono, Federico Ricci-Tersenghi, Francesco Zamponi

Combinatorial optimization problems are central to both practical applications and the development of optimization methods. While classical and quantum algorithms have been refined over decades, machine learning-assisted approaches are comparatively recent and have not yet consistently outperformed simple, state-of-the-art classical methods. Here, we focus on a class of Quadratic Unconstrained Binary Optimization (QUBO) problems, specifically the challenge of finding minimum energy configurations in three-dimensional Ising spin glasses. We use a Global Annealing Monte Carlo algorithm that integrates standard local moves with global moves proposed via machine learning. We show that local moves play a crucial role in achieving optimal performance. Benchmarking against Simulated Annealing and Population Annealing, we demonstrate that Global Annealing not only surpasses the performance of Simulated Annealing but also exhibits greater robustness than Population Annealing, maintaining effectiveness across problem hardness and system size without hyperparameter tuning. These results provide, to our knowledge, the first clear and robust evidence that a machine learning-assisted optimization method can exceed the capabilities of classical state-of-the-art techniques in a combinatorial optimization setting.

DIS-NNMay 10, 2023
Phase transitions in the mini-batch size for sparse and dense two-layer neural networks

Raffaele Marino, Federico Ricci-Tersenghi

The use of mini-batches of data in training artificial neural networks is nowadays very common. Despite its broad usage, theories explaining quantitatively how large or small the optimal mini-batch size should be are missing. This work presents a systematic attempt at understanding the role of the mini-batch size in training two-layer neural networks. Working in the teacher-student scenario, with a sparse teacher, and focusing on tasks of different complexity, we quantify the effects of changing the mini-batch size $m$. We find that often the generalization performances of the student strongly depend on $m$ and may undergo sharp phase transitions at a critical value $m_c$, such that for $m<m_c$ the training process fails, while for $m>m_c$ the student learns perfectly or generalizes very well the teacher. Phase transitions are induced by collective phenomena firstly discovered in statistical mechanics and later observed in many fields of science. Observing a phase transition by varying the mini-batch size across different architectures raises several questions about the role of this hyperparameter in the neural network learning process.

DIS-NNNov 26, 2021
Nonequilibrium Monte Carlo for unfreezing variables in hard combinatorial optimization

Masoud Mohseni, Daniel Eppens, Johan Strumpfer et al.

Optimizing highly complex cost/energy functions over discrete variables is at the heart of many open problems across different scientific disciplines and industries. A major obstacle is the emergence of many-body effects among certain subsets of variables in hard instances leading to critical slowing down or collective freezing for known stochastic local search strategies. An exponential computational effort is generally required to unfreeze such variables and explore other unseen regions of the configuration space. Here, we introduce a quantum-inspired family of nonlocal Nonequilibrium Monte Carlo (NMC) algorithms by developing an adaptive gradient-free strategy that can efficiently learn key instance-wise geometrical features of the cost function. That information is employed on-the-fly to construct spatially inhomogeneous thermal fluctuations for collectively unfreezing variables at various length scales, circumventing costly exploration versus exploitation trade-offs. We apply our algorithm to two of the most challenging combinatorial optimization problems: random k-satisfiability (k-SAT) near the computational phase transitions and Quadratic Assignment Problems (QAP). We observe significant speedup and robustness over both specialized deterministic solvers and generic stochastic solvers. In particular, for 90% of random 4-SAT instances we find solutions that are inaccessible for the best specialized deterministic algorithm known as Survey Propagation (SP) with an order of magnitude improvement in the quality of solutions for the hardest 10% instances. We also demonstrate two orders of magnitude improvement in time-to-solution over the state-of-the-art generic stochastic solver known as Adaptive Parallel Tempering (APT).

MLMay 29, 2019
How to iron out rough landscapes and get optimal performances: Averaged Gradient Descent and its application to tensor PCA

Giulio Biroli, Chiara Cammarota, Federico Ricci-Tersenghi

In many high-dimensional estimation problems the main task consists in minimizing a cost function, which is often strongly non-convex when scanned in the space of parameters to be estimated. A standard solution to flatten the corresponding rough landscape consists in summing the losses associated to different data points and obtain a smoother empirical risk. Here we propose a complementary method that works for a single data point. The main idea is that a large amount of the roughness is uncorrelated in different parts of the landscape. One can then substantially reduce the noise by evaluating an empirical average of the gradient obtained as a sum over many random independent positions in the space of parameters to be optimized. We present an algorithm, called Averaged Gradient Descent, based on this idea and we apply it to tensor PCA, which is a very hard estimation problem. We show that Averaged Gradient Descent over-performs physical algorithms such as gradient descent and approximate message passing and matches the best algorithmic thresholds known so far, obtained by tensor unfolding and methods based on sum-of-squares.

CVFeb 16, 2018
SpaRTA - Tracking across occlusions via global partitioning of 3D clouds of points

Andrea Cavagna, Stefania Melillo, Leonardo Parisi et al.

Any 3D tracking algorithm has to deal with occlusions: multiple targets get so close to each other that the loss of their identities becomes likely. In the best case scenario, trajectories are interrupted, thus curbing the completeness of the data-set; in the worse case scenario, identity switches arise, potentially affecting in severe ways the very quality of the data. Here, we present a novel tracking method that addresses the problem of occlusions within large groups of featureless objects by means of three steps: i) it represents each target as a cloud of points in 3D; ii) once a 3D cluster corresponding to an occlusion occurs, it defines a partitioning problem by introducing a cost function that uses both attractive and repulsive spatio-temporal proximity links; iii) it minimizes the cost function through a semi-definite optimization technique specifically designed to cope with link frustration. The algorithm is independent of the specific experimental method used to collect the data. By performing tests on public data-sets, we show that the new algorithm produces a significant improvement over the state-of-the-art tracking methods, both by reducing the number of identity switches and by increasing the accuracy of the actual positions of the targets in real space.

MLNov 2, 2016
Improving variational methods via pairwise linear response identities

Jack Raymond, Federico Ricci-Tersenghi

Inference methods are often formulated as variational approximations: these approximations allow easy evaluation of statistics by marginalization or linear response, but these estimates can be inconsistent. We show that by introducing constraints on covariance, one can ensure consistency of linear response with the variational parameters, and in so doing inference of marginal probability distributions is improved. For the Bethe approximation and its generalizations, improvements are achieved with simple choices of the constraints. The approximations are presented as variational frameworks; iterative procedures related to message passing are provided for finding the minima.

MLMar 30, 2016
Performance of a community detection algorithm based on semidefinite programming

Adel Javanmard, Andrea Montanari, Federico Ricci-Tersenghi

The problem of detecting communities in a graph is maybe one the most studied inference problems, given its simplicity and widespread diffusion among several disciplines. A very common benchmark for this problem is the stochastic block model or planted partition problem, where a phase transition takes place in the detection of the planted partition by changing the signal-to-noise ratio. Optimal algorithms for the detection exist which are based on spectral methods, but we show these are extremely sensible to slight modification in the generative model. Recently Javanmard, Montanari and Ricci-Tersenghi (arXiv:1511.08769) have used statistical physics arguments, and numerical simulations to show that finding communities in the stochastic block model via semidefinite programming is quasi optimal. Further, the resulting semidefinite relaxation can be solved efficiently, and is very robust with respect to changes in the generative model. In this paper we study in detail several practical aspects of this new algorithm based on semidefinite programming for the detection of the planted partition. The algorithm turns out to be very fast, allowing the solution of problems with $O(10^5)$ variables in few second on a laptop computer.

CCAug 20, 2015
The backtracking survey propagation algorithm for solving random K-SAT problems

Raffaele Marino, Giorgio Parisi, Federico Ricci-Tersenghi

Discrete combinatorial optimization has a central role in many scientific disciplines, however, for hard problems we lack linear time algorithms that would allow us to solve very large instances. Moreover, it is still unclear what are the key features that make a discrete combinatorial optimization problem hard to solve. Here we study random K-satisfiability problems with $K=3,4$, which are known to be very hard close to the SAT-UNSAT threshold, where problems stop having solutions. We show that the backtracking survey propagation algorithm, in a time practically linear in the problem size, is able to find solutions very close to the threshold, in a region unreachable by any other algorithm. All solutions found have no frozen variables, thus supporting the conjecture that only unfrozen solutions can be found in linear time, and that a problem becomes impossible to solve in linear time when all solutions contain frozen variables.