Gabriele Santin

LG
h-index17
23papers
459citations
Novelty42%
AI Score41

23 Papers

NAFeb 28, 2015
Approximation of Eigenfunctions in Kernel-based Spaces

Gabriele Santin, Robert Schaback

Kernel-based methods in Numerical Analysis have the advantage of yielding optimal recovery processes in the "native" Hilbert space $\calh$ in which they are reproducing. Continuous kernels on compact domains have an expansion into eigenfunctions that are both $L_2$-orthonormal and orthogonal in $\calh$ (Mercer expansion). This paper examines the corresponding eigenspaces and proves that they have optimality properties among all other subspaces of $\calh$. These results have strong connections to $n$-widths in Approximation Theory, and they establish that errors of optimal approximations are closely related to the decay of the eigenvalues. Though the eigenspaces and eigenvalues are not readily available, they can be well approximated using the standard $n$-dimensional subspaces spanned by translates of the kernel with respect to $n$ nodes or centers. We give error bounds for the numerical approximation of the eigensystem via such subspaces. A series of examples shows that our numerical technique via a greedy point selection strategy allows to calculate the eigensystems with good accuracy.

NADec 8, 2016
Convergence rate of the data-independent $P$-greedy algorithm in kernel-based approximation

Gabriele Santin, Bernard Haasdonk

Kernel-based methods provide flexible and accurate algorithms for the reconstruction of functions from meshless samples. A major question in the use of such methods is the influence of the samples locations on the behavior of the approximation, and feasible optimal strategies are not known for general problems. Nevertheless, efficient and greedy point-selection strategies are known. This paper gives a proof of the convergence rate of the data-independent \textit{$P$-greedy} algorithm, based on the application of the convergence theory for greedy algorithms in reduced basis methods. The resulting rate of convergence is shown to be near-optimal in the case of kernels generating Sobolev spaces. As a consequence, this convergence rate proves that, for kernels of Sobolev spaces, the points selected by the algorithm are asymptotically uniformly distributed, as conjectured in the paper where the algorithm has been introduced.

LGFeb 2, 2023
Graph Neural Networks for temporal graphs: State of the art, open challenges, and opportunities

Antonio Longa, Veronica Lachi, Gabriele Santin et al.

Graph Neural Networks (GNNs) have become the leading paradigm for learning on (static) graph-structured data. However, many real-world systems are dynamic in nature, since the graph and node/edge attributes change over time. In recent years, GNN-based models for temporal graphs have emerged as a promising area of research to extend the capabilities of GNNs. In this work, we provide the first comprehensive overview of the current state-of-the-art of temporal GNN, introducing a rigorous formalization of learning settings and tasks and a novel taxonomy categorizing existing approaches in terms of how the temporal aspect is represented and processed. We conclude the survey with a discussion of the most relevant open challenges for the field, from both research and application perspectives.

LGOct 27, 2022
Explaining the Explainers in Graph Neural Networks: a Comparative Study

Antonio Longa, Steve Azzolin, Gabriele Santin et al.

Following a fast initial breakthrough in graph based learning, Graph Neural Networks (GNNs) have reached a widespread application in many science and engineering fields, prompting the need for methods to understand their decision process. GNN explainers have started to emerge in recent years, with a multitude of methods both novel or adapted from other domains. To sort out this plethora of alternative approaches, several studies have benchmarked the performance of different explainers in terms of various explainability metrics. However, these earlier works make no attempts at providing insights into why different GNN architectures are more or less explainable, or which explainer should be preferred in a given setting. In this survey, we fill these gaps by devising a systematic experimental study, which tests ten explainers on eight representative architectures trained on six carefully designed graph and node classification datasets. With our results we provide key insights on the choice and applicability of GNN explainers, we isolate key components that make them usable and successful and provide recommendations on how to avoid common interpretation pitfalls. We conclude by highlighting open questions and directions of possible future research.

NAJul 24, 2018
Interpolation with uncoupled separable matrix-valued kernels

Dominik Wittwar, Gabriele Santin, Bernard Haasdonk

In this paper we consider the problem of approximating vector-valued functions over a domain $Ω$. For this purpose, we use matrix-valued reproducing kernels, which can be related to Reproducing kernel Hilbert spaces of vectorial functions and which can be viewed as an extension to the scalar-valued case. These spaces seem promising, when modelling correlations between the target function components, as the components are not learned independently of one another. We focus on the interpolation with such matrix-valued kernels. We derive error bounds for the interpolation error in terms of a generalized power-function and we introduce a subclass of matrix-valued kernels whose power-functions can be traced back to the power-function of scalar-valued reproducing kernels. Finally, we apply these kind of kernels to some artificial data to illustrate the benefit of interpolation with matrix-valued kernels in comparison to a componentwise approach.

DSOct 30, 2018
Greedy Kernel Methods for Center Manifold Approximation

Bernard Haasdonk, Boumediene Hamzi, Gabriele Santin et al.

For certain dynamical systems it is possible to significantly simplify the study of stability by means of the center manifold theory. This theory allows to isolate the complicated asymptotic behavior of the system close to a non-hyperbolic equilibrium point, and to obtain meaningful predictions of its behavior by analyzing a reduced dimensional problem. Since the manifold is usually not known, approximation methods are of great interest to obtain qualitative estimates. In this work, we use a data-based greedy kernel method to construct a suitable approximation of the manifold close to the equilibrium. The data are collected by repeated numerical simulation of the full system by means of a high-accuracy solver, which generates sets of discrete trajectories that are then used to construct a surrogate model of the manifold. The method is tested on different examples which show promising performance and good accuracy.

NANov 30, 2016
A rescaled method for RBF approximation

Stefano De Marchi, Andrea Idda, Gabriele Santin

In the recent paper [8], a new method to compute stable kernel-based interpolants has been presented. This \textit{rescaled interpolation} method combines the standard kernel interpolation with a properly defined rescaling operation, which smooths the oscillations of the interpolant. Although promising, this procedure lacks a systematic theoretical investigation. Through our analysis, this novel method can be understood as standard kernel interpolation by means of a properly rescaled kernel. This point of view allow us to consider its error and stability properties.

NAJul 25, 2018
Greedy regularized kernel interpolation

Gabriele Santin, Dominik Wittwar, Bernard Haasdonk

Kernel based regularized interpolation is a well known technique to approximate a continuous multivariate function using a set of scattered data points and the corresponding function evaluations, or data values. This method has some advantage over exact interpolation: one can obtain the same approximation order while solving a better conditioned linear system. This method is well suited also for noisy data values, where exact interpolation is not meaningful. Moreover, it allows more flexibility in the kernel choice, since approximation problems can be solved also for non strictly positive definite kernels. We discuss in this paper a greedy algorithm to compute a sparse approximation of the kernel regularized interpolant. This sparsity is a desirable property when the approximant is used as a surrogate of an expensive function, since the resulting model is fast to evaluate. Moreover, we derive convergence results for the approximation scheme, and we prove that a certain greedy selection rule produces asymptotically quasi-optimal error rates.

AISep 15, 2023
Autonomous and Human-Driven Vehicles Interacting in a Roundabout: A Quantitative and Qualitative Evaluation

Laura Ferrarotti, Massimiliano Luca, Gabriele Santin et al.

Optimizing traffic dynamics in an evolving transportation landscape is crucial, particularly in scenarios where autonomous vehicles (AVs) with varying levels of autonomy coexist with human-driven cars. While optimizing Reinforcement Learning (RL) policies for such scenarios is becoming more and more common, little has been said about realistic evaluations of such trained policies. This paper presents an evaluation of the effects of AVs penetration among human drivers in a roundabout scenario, considering both quantitative and qualitative aspects. In particular, we learn a policy to minimize traffic jams (i.e., minimize the time to cross the scenario) and to minimize pollution in a roundabout in Milan, Italy. Through empirical analysis, we demonstrate that the presence of AVs} can reduce time and pollution levels. Furthermore, we qualitatively evaluate the learned policy using a cutting-edge cockpit to assess its performance in near-real-world conditions. To gauge the practicality and acceptability of the policy, we conduct evaluations with human participants using the simulator, focusing on a range of metrics like traffic smoothness and safety perception. In general, our findings show that human-driven vehicles benefit from optimizing AVs dynamics. Also, participants in the study highlight that the scenario with 80% AVs is perceived as safer than the scenario with 20%. The same result is obtained for traffic smoothness perception.

NAOct 5, 2012
A new stable basis for RBF approximation

Gabriele Santin

It's well know that Radial Basis Function approximants suffers of bad conditioning if the simple basis of translates is used. A recent work of M.Pazouki and R.Schaback gives a quite general way to build stable, orthonormal bases for the native space based on a factorization of the kernel matrix A. Starting from that setting we describe a particular orthonormal basis that arises from a weighted singular value decomposition of A. This basis is related to a discretization of the compact operator which leads to the so-called eigenbasis, and provides a connection with it. We give convergence estimates and stability bound for the interpolation and the discrete least-squares approximation based on this basis, which involves the eigenvalues of such an operator.

NADec 12, 2015
Approximating basins of attraction for dynamical systems via stable radial bases

Roberto Cavoretto, Stefano De Marchi, Alessandra De Rossi et al.

In applied sciences, such as physics and biology, it is often required to model the evolution of populations via dynamical systems. In this paper, we focus on the problem of approximating the basins of attraction of such models in case of multi-stability. We propose to reconstruct the domains of attraction via an implicit interpolant using stable radial bases, obtaining the surfaces by partitioning the phase space into disjoint regions. An application to a competition model presenting jointly three stable equilibria is considered.

LGMar 15, 2022
A Framework for Verifiable and Auditable Federated Anomaly Detection

Gabriele Santin, Inna Skarbovsky, Fabiana Fournier et al.

Federated Leaning is an emerging approach to manage cooperation between a group of agents for the solution of Machine Learning tasks, with the goal of improving each agent's performance without disclosing any data. In this paper we present a novel algorithmic architecture that tackle this problem in the particular case of Anomaly Detection (or classification or rare events), a setting where typical applications often comprise data with sensible information, but where the scarcity of anomalous examples encourages collaboration. We show how Random Forests can be used as a tool for the development of accurate classifiers with an effective insight-sharing mechanism that does not break the data integrity. Moreover, we explain how the new architecture can be readily integrated in a blockchain infrastructure to ensure the verifiable and auditable execution of the algorithm. Furthermore, we discuss how this work may set the basis for a more general approach for the design of federated ensemble-learning methods beyond the specific task and architecture discussed in this paper.

LGMar 11, 2022
Reprogramming FairGANs with Variational Auto-Encoders: A New Transfer Learning Model

Beatrice Nobile, Gabriele Santin, Bruno Lepri et al.

Fairness-aware GANs (FairGANs) exploit the mechanisms of Generative Adversarial Networks (GANs) to impose fairness on the generated data, freeing them from both disparate impact and disparate treatment. Given the model's advantages and performance, we introduce a novel learning framework to transfer a pre-trained FairGAN to other tasks. This reprogramming process has the goal of maintaining the FairGAN's main targets of data utility, classification utility, and data fairness, while widening its applicability and ease of use. In this paper we present the technical extensions required to adapt the original architecture to this new framework (and in particular the use of Variational Auto-Encoders), and discuss the benefits, trade-offs, and limitations of the new model.

NADec 15, 2022
Interpolation with the polynomial kernels

Giacomo Elefante, Wolfgang Erb, Francesco Marchetti et al.

The polynomial kernels are widely used in machine learning and they are one of the default choices to develop kernel-based classification and regression models. However, they are rarely used and considered in numerical analysis due to their lack of strict positive definiteness. In particular they do not enjoy the usual property of unisolvency for arbitrary point sets, which is one of the key properties used to build kernel-based interpolation methods. This paper is devoted to establish some initial results for the study of these kernels, and their related interpolation algorithms, in the context of approximation theory. We will first prove necessary and sufficient conditions on point sets which guarantee the existence and uniqueness of an interpolant. We will then study the Reproducing Kernel Hilbert Spaces (or native spaces) of these kernels and their norms, and provide inclusion relations between spaces corresponding to different kernel parameters. With these spaces at hand, it will be further possible to derive generic error estimates which apply to sufficiently smooth functions, thus escaping the native space. Finally, we will show how to employ an efficient stable algorithm to these kernels to obtain accurate interpolants, and we will test them in some numerical experiment. After this analysis several computational and theoretical aspects remain open, and we will outline possible further research directions in a concluding section. This work builds some bridges between kernel and polynomial interpolation, two topics to which the authors, to different extents, have been introduced under the supervision or through the work of Stefano De Marchi. For this reason, they wish to dedicate this work to him in the occasion of his 60th birthday.

11.7LGMay 4
Differentiable Kernel Ridge Regression for Deep Learning Pipelines

Jean-Marc Mercier, Gabriele Santin

Deep neural networks dominate modern machine learning, while alternative function approximators remain comparatively underexplored at scale. In this work, we revisit kernel methods as drop-in components for standard deep learning pipelines. We introduce \emph{Sparse Kernels} (SKs), a differentiable, localized, and lazy variant of kernel ridge regression (KRR) that defers training to inference time and reduces to the solution of small local systems. We integrate SKs into PyTorch as modular layers that preserve end-to-end trainability, and we show that they expose three distinct sets of parameters -- feature representations, target values, and evaluation points -- each of which can be fixed or learned. This decomposition broadens the design space available to practitioners, enabling, in particular, training-free transfer, nonlinear probing, and hybrid kernel-neural models. Across convolutional networks, vision transformers, and reinforcement learning, SK-based modules serve two complementary roles: in some settings, they match the performance of trained neural readouts with substantially less training; in others, they augment existing models and improve their performance when used as additional components. Our results suggest that kernel methods, once made scalable and differentiable, can be readily integrated with deep learning rather than treated as a separate paradigm.

LGJan 17, 2024
A Characterization Theorem for Equivariant Networks with Point-wise Activations

Marco Pacini, Xiaowen Dong, Bruno Lepri et al.

Equivariant neural networks have shown improved performance, expressiveness and sample complexity on symmetrical domains. But for some specific symmetries, representations, and choice of coordinates, the most common point-wise activations, such as ReLU, are not equivariant, hence they cannot be employed in the design of equivariant neural networks. The theorem we present in this paper describes all possible combinations of finite-dimensional representations, choice of coordinates and point-wise activations to obtain an exactly equivariant layer, generalizing and strengthening existing characterizations. Notable cases of practical relevance are discussed as corollaries. Indeed, we prove that rotation-equivariant networks can only be invariant, as it happens for any network which is equivariant with respect to connected compact groups. Then, we discuss implications of our findings when applied to important instances of exactly equivariant networks. First, we completely characterize permutation equivariant networks such as Invariant Graph Networks with point-wise nonlinearities and their geometric counterparts, highlighting a plethora of models whose expressive power and performance are still unknown. Second, we show that feature spaces of disentangled steerable convolutional neural networks are trivial representations.

LGJun 2, 2025
On Universality Classes of Equivariant Networks

Marco Pacini, Gabriele Santin, Bruno Lepri et al.

Equivariant neural networks provide a principled framework for incorporating symmetry into learning architectures and have been extensively analyzed through the lens of their separation power, that is, the ability to distinguish inputs modulo symmetry. This notion plays a central role in settings such as graph learning, where it is often formalized via the Weisfeiler-Leman hierarchy. In contrast, the universality of equivariant models-their capacity to approximate target functions-remains comparatively underexplored. In this work, we investigate the approximation power of equivariant neural networks beyond separation constraints. We show that separation power does not fully capture expressivity: models with identical separation power may differ in their approximation ability. To demonstrate this, we characterize the universality classes of shallow invariant networks, providing a general framework for understanding which functions these architectures can approximate. Since equivariant models reduce to invariant ones under projection, this analysis yields sufficient conditions under which shallow equivariant networks fail to be universal. Conversely, we identify settings where shallow models do achieve separation-constrained universality. These positive results, however, depend critically on structural properties of the symmetry group, such as the existence of adequate normal subgroups, which may not hold in important cases like permutation symmetry.

LGJun 13, 2024
Separation Power of Equivariant Neural Networks

Marco Pacini, Xiaowen Dong, Bruno Lepri et al.

The separation power of a machine learning model refers to its ability to distinguish between different inputs and is often used as a proxy for its expressivity. Indeed, knowing the separation power of a family of models is a necessary condition to obtain fine-grained universality results. In this paper, we analyze the separation power of equivariant neural networks, such as convolutional and permutation-invariant networks. We first present a complete characterization of inputs indistinguishable by models derived by a given architecture. From this results, we derive how separability is influenced by hyperparameters and architectural choices-such as activation functions, depth, hidden layer width, and representation types. Notably, all non-polynomial activations, including ReLU and sigmoid, are equivalent in expressivity and reach maximum separation power. Depth improves separation power up to a threshold, after which further increases have no effect. Adding invariant features to hidden representations does not impact separation power. Finally, block decomposition of hidden representations affects separability, with minimal components forming a hierarchy in separation power that provides a straightforward method for comparing the separation power of models.

LGMay 15, 2021
Analysis of Structured Deep Kernel Networks

Tizian Wenzel, Gabriele Santin, Bernard Haasdonk

In this paper, we leverage a recent deep kernel representer theorem to connect kernel based learning and (deep) neural networks in order to understand their interplay. In particular, we show that the use of special types of kernels yields models reminiscent of neural networks that are founded in the same theoretical framework of classical kernel methods, while benefiting from the computational advantages of deep neural networks. Especially the introduced Structured Deep Kernel Networks (SDKNs) can be viewed as neural networks (NNs) with optimizable activation functions obeying a representer theorem. This link allows us to analyze also NNs within the framework of kernel networks. We prove analytic properties of the SDKNs which show their universal approximation properties in three different asymptotic regimes of unbounded number of centers, width and depth. Especially in the case of unbounded depth, more accurate constructions can be achieved using fewer layers compared to corresponding constructions for ReLU neural networks. This is made possible by leveraging properties of kernel approximation.

LGMar 25, 2021
Structured Deep Kernel Networks for Data-Driven Closure Terms of Turbulent Flows

Tizian Wenzel, Marius Kurz, Andrea Beck et al.

Standard kernel methods for machine learning usually struggle when dealing with large datasets. We review a recently introduced Structured Deep Kernel Network (SDKN) approach that is capable of dealing with high-dimensional and huge datasets - and enjoys typical standard machine learning approximation properties. We extend the SDKN to combine it with standard machine learning modules and compare it with Neural Networks on the scientific challenge of data-driven prediction of closure terms of turbulent flows. We show experimentally that the SDKNs are capable of dealing with large datasets and achieve near-perfect accuracy on the given application.

LGMar 2, 2021
Kernel-Based Models for Influence Maximization on Graphs based on Gaussian Process Variance Minimization

Salvatore Cuomo, Wolfgang Erb, Gabriele Santin

The inference of novel knowledge, the discovery of hidden patterns, and the uncovering of insights from large amounts of data from a multitude of sources make Data Science (DS) to an art rather than just a mere scientific discipline. The study and design of mathematical models able to analyze information represents a central research topic in DS. In this work, we introduce and investigate a novel model for influence maximization (IM) on graphs using ideas from kernel-based approximation, Gaussian process regression, and the minimization of a corresponding variance term. Data-driven approaches can be applied to determine proper kernels for this IM model and machine learning methodologies are adopted to tune the model parameters. Compared to stochastic models in this field that rely on costly Monte-Carlo simulations, our model allows for a simple and cost-efficient update strategy to compute optimal influencing nodes on a graph. In several numerical experiments, we show the properties and benefits of this new model.

NADec 1, 2020
Kernel methods for center manifold approximation and a data-based version of the Center Manifold Theorem

Bernard Haasdonk, Boumediene Hamzi, Gabriele Santin et al.

For dynamical systems with a non hyperbolic equilibrium, it is possible to significantly simplify the study of stability by means of the center manifold theory. This theory allows to isolate the complicated asymptotic behavior of the system close to the equilibrium point and to obtain meaningful predictions of its behavior by analyzing a reduced order system on the so-called center manifold. Since the center manifold is usually not known, good approximation methods are important as the center manifold theorem states that the stability properties of the origin of the reduced order system are the same as those of the origin of the full order system. In this work, we establish a data-based version of the center manifold theorem that works by considering an approximation in place of an exact manifold. Also the error between the approximated and the original reduced dynamics are quantified. We then use an apposite data-based kernel method to construct a suitable approximation of the manifold close to the equilibrium, which is compatible with our general error theory. The data are collected by repeated numerical simulation of the full system by means of a high-accuracy solver, which generates sets of discrete trajectories that are then used as a training set. The method is tested on different examples which show promising performance and good accuracy.

NAApr 27, 2020
Biomechanical surrogate modelling using stabilized vectorial greedy kernel methods

Bernard Haasdonk, Tizian Wenzel, Gabriele Santin et al.

Greedy kernel approximation algorithms are successful techniques for sparse and accurate data-based modelling and function approximation. Based on a recent idea of stabilization of such algorithms in the scalar output case, we here consider the vectorial extension built on VKOGA. We introduce the so called $γ$-restricted VKOGA, comment on analytical properties and present numerical evaluation on data from a clinically relevant application, the modelling of the human spine. The experiments show that the new stabilized algorithms result in improved accuracy and stability over the non-stabilized algorithms.