Gianluca Detommaso

ML
h-index: 3
9 papers
270 citations
Novelty: 46%
AI Score: 31

9 Papers

LG · Feb 8, 2023 · Code
Fortuna: A Library for Uncertainty Quantification in Deep Learning

Gianluca Detommaso, Alberto Gasparin, Michele Donini et al. · amazon-science

We present Fortuna, an open-source library for uncertainty quantification in deep learning. Fortuna supports a range of calibration techniques, such as conformal prediction that can be applied to any trained neural network to generate reliable uncertainty estimates, and scalable Bayesian inference methods that can be applied to Flax-based deep neural networks trained from scratch for improved uncertainty quantification and accuracy. By providing a coherent framework for advanced uncertainty quantification methods, Fortuna simplifies the process of benchmarking and helps practitioners build robust AI systems.
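
As an illustration of the conformal prediction idea mentioned in the abstract, here is a minimal split-conformal sketch for regression in plain Python. It does not use Fortuna's API; `predict`, `sample`, and the synthetic data are hypothetical stand-ins for an already trained model and a held-out calibration set.

```python
import math
import random

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    # Rank ceil((n + 1) * (1 - alpha)) among the sorted scores.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def predict(x):
    """Hypothetical stand-in for any trained regressor."""
    return 2.0 * x

rng = random.Random(0)

def sample(n):
    """Synthetic data (assumption): y = 2x + Gaussian noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 0.1)) for x in xs]

calib = sample(500)   # held-out calibration set
test = sample(2000)   # fresh points used to check coverage

alpha = 0.1
# Nonconformity score: absolute residual on the calibration set.
scores = [abs(y - predict(x)) for x, y in calib]
q = conformal_quantile(scores, alpha)

# The interval [predict(x) - q, predict(x) + q] should cover about 1 - alpha of new points.
covered = sum(abs(y - predict(x)) <= q for x, y in test)
coverage = covered / len(test)
print(f"half-width={q:.3f}, empirical coverage={coverage:.3f}")
```

The model is never retrained: only its residuals on held-out data are used, which is why this kind of calibration can wrap any trained network.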

NA · Feb 21, 2018
Continuous Level Monte Carlo and Sample-Adaptive Model Hierarchies

Gianluca Detommaso, Tim Dodwell, Rob Scheichl · amazon-science

In this paper, we present a generalisation of the Multilevel Monte Carlo (MLMC) method to a setting where the level parameter is a continuous variable. This Continuous Level Monte Carlo (CLMC) estimator provides a natural framework in PDE applications to adapt the model hierarchy to each sample. In addition, it can be made unbiased with respect to the expected value of the true quantity of interest provided the quantity of interest converges sufficiently fast. The practical implementation of the CLMC estimator is based on interpolating actual evaluations of the quantity of interest at a finite number of resolutions. As our new level parameter, we use the logarithm of a goal-oriented finite element error estimator for the accuracy of the quantity of interest. We prove the unbiasedness, as well as a complexity theorem that shows the same rate of complexity for CLMC as for MLMC. Finally, we provide some numerical evidence to support our theoretical results, by successfully testing CLMC on a standard PDE test problem. The numerical experiments demonstrate clear gains for sample-wise adaptive refinement strategies over uniform refinements.
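
For context, the standard discrete-level MLMC estimator that CLMC generalises can be sketched as a telescoping sum. This is not the continuous-level method of the paper; the toy quantity of interest (a midpoint-rule integral of exp(z x) with random z) and the sample counts per level are assumptions chosen only for illustration.

```python
import math
import random

def q_level(z, level):
    """Midpoint-rule approximation of the integral of exp(z * x) over [0, 1], using 2**level cells."""
    n = 2 ** level
    h = 1.0 / n
    return h * sum(math.exp(z * (i + 0.5) * h) for i in range(n))

def mlmc_estimate(samples_per_level, rng):
    """Telescoping estimator: E[Q_L] = E[Q_0] + sum over l of E[Q_l - Q_{l-1}].

    Each level uses its own independent samples; within a level the same z
    drives the fine and coarse approximations, which keeps the variance of
    the correction term small.
    """
    total = 0.0
    for level, n_samples in enumerate(samples_per_level):
        acc = 0.0
        for _ in range(n_samples):
            z = rng.gauss(0.0, 1.0)
            fine = q_level(z, level)
            coarse = q_level(z, level - 1) if level > 0 else 0.0
            acc += fine - coarse
        total += acc / n_samples
    return total

rng = random.Random(0)
# Many cheap coarse samples, few expensive fine ones (hypothetical allocation).
estimate = mlmc_estimate([4000, 1000, 250, 60], rng)
# Exact value for this toy problem: integral of exp(x**2 / 2) over [0, 1], roughly 1.195.
print(f"MLMC estimate: {estimate:.4f}")
```

The decreasing sample counts reflect the usual MLMC trade-off: correction terms on finer levels have smaller variance, so they need fewer of the costly fine evaluations.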

ML · Jul 17, 2022
Uncertainty Calibration in Bayesian Neural Networks via Distance-Aware Priors

Gianluca Detommaso, Alberto Gasparin, Andrew Wilson et al. · amazon-science

As we move away from the data, the predictive uncertainty should increase, since a great variety of explanations are consistent with the little available information. We introduce Distance-Aware Prior (DAP) calibration, a method to correct overconfidence of Bayesian deep learning models outside of the training domain. We define DAPs as prior distributions over the model parameters that depend on the inputs through a measure of their distance from the training set. DAP calibration is agnostic to the posterior inference method, and it can be performed as a post-processing step. We demonstrate its effectiveness against several baselines in a variety of classification and regression problems, including benchmarks designed to test the quality of predictive distributions away from the data.

CL · Jul 31, 2024
Cost-Effective Hallucination Detection for LLMs

Simon Valentin, Jinmiao Fu, Gianluca Detommaso et al.

Large language models (LLMs) can be prone to hallucinations: generating unreliable outputs that are unfaithful to their inputs or to external facts, or that are internally inconsistent. In this work, we address several challenges for post-hoc hallucination detection in production settings. Our pipeline for hallucination detection entails: first, producing a confidence score representing the likelihood that a generated answer is a hallucination; second, calibrating the score conditional on attributes of the inputs and candidate response; finally, performing detection by thresholding the calibrated score. We benchmark a variety of state-of-the-art scoring methods on different datasets, encompassing question answering, fact checking, and summarization tasks. We employ diverse LLMs to ensure a comprehensive assessment of performance. We show that calibrating individual scoring methods is critical for ensuring risk-aware downstream decision making. Based on findings that no individual score performs best in all situations, we propose a multi-scoring framework, which combines different scores and achieves top performance across all datasets. We further introduce cost-effective multi-scoring, which can match or even outperform more expensive detection methods, while significantly reducing computational overhead.
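
The three-stage pipeline in the abstract (score, calibrate, threshold) can be sketched as follows. This is a simplified illustration, not the paper's method: it swaps in plain histogram binning for calibration and does not condition on input attributes. The synthetic scores and all function names are hypothetical.

```python
import random

def fit_histogram_binning(scores, labels, n_bins=10):
    """Map raw scores in [0, 1] to the empirical hallucination rate of their bin."""
    totals = [0] * n_bins
    positives = [0] * n_bins
    for s, y in zip(scores, labels):
        b = min(int(s * n_bins), n_bins - 1)
        totals[b] += 1
        positives[b] += y
    base_rate = sum(labels) / len(labels)
    # Empty bins fall back to the overall base rate.
    return [positives[b] / totals[b] if totals[b] else base_rate
            for b in range(n_bins)]

def calibrate(score, bin_rates):
    """Stage 2: replace a raw score by the calibrated rate of its bin."""
    n_bins = len(bin_rates)
    return bin_rates[min(int(score * n_bins), n_bins - 1)]

def detect(score, bin_rates, threshold=0.5):
    """Stage 3: flag a response when its calibrated score reaches the threshold."""
    return calibrate(score, bin_rates) >= threshold

# Stage 1 stand-in (assumption): a raw score s that is miscalibrated,
# because the true hallucination probability is s**2 rather than s.
rng = random.Random(0)
raw = [rng.random() for _ in range(5000)]
labels = [1 if rng.random() < s * s else 0 for s in raw]

bin_rates = fit_histogram_binning(raw, labels)
print(detect(0.55, bin_rates), detect(0.95, bin_rates))
```

In this toy setup a raw score of 0.55 would pass a naive 0.5 threshold, while its calibrated value is close to 0.3, which illustrates why thresholding uncalibrated scores can lead to poorly controlled decisions.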

ML · Apr 6, 2024
Multicalibration for Confidence Scoring in LLMs

Gianluca Detommaso, Martin Bertran, Riccardo Fogliato et al.

This paper proposes the use of "multicalibration" to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs). Multicalibration asks for calibration not just marginally, but simultaneously across various intersecting groupings of the data. We show how to form groupings for prompt/completion pairs that are correlated with the probability of correctness via two techniques: clustering within an embedding space, and "self-annotation" - querying the LLM by asking it various yes-or-no questions about the prompt. We also develop novel variants of multicalibration algorithms that offer performance improvements by reducing their tendency to overfit. Through systematic benchmarking across various question answering datasets and LLMs, we show how our techniques can yield confidence scores that provide substantial improvements in fine-grained measures of both calibration and accuracy compared to existing methods.
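
To make "calibration across intersecting groupings" concrete, here is a sketch of a basic iterative patching scheme for multicalibration. It is a textbook-style baseline, not one of the paper's novel variants; the two binary group attributes, the synthetic labels, and the parameters `tol`, `min_count`, and `max_iters` are assumptions made for the example.

```python
import random

def worst_cell(preds, labels, groups, n_bins, min_count):
    """Return (|gap|, gap, member indices) for the worst-calibrated (group, bin) cell."""
    best = (0.0, 0.0, [])
    for members in groups:
        cells = {}
        for i in members:
            b = min(int(preds[i] * n_bins), n_bins - 1)
            cells.setdefault(b, []).append(i)
        for idx in cells.values():
            if len(idx) < min_count:
                continue  # skip cells too small to estimate reliably
            gap = sum(labels[i] - preds[i] for i in idx) / len(idx)
            if abs(gap) > best[0]:
                best = (abs(gap), gap, idx)
    return best

def multicalibrate(preds, labels, groups, n_bins=10, tol=0.02,
                   min_count=50, max_iters=200):
    """Repeatedly shift the worst cell by its mean residual until all gaps are within tol."""
    preds = list(preds)
    for _ in range(max_iters):
        size, gap, idx = worst_cell(preds, labels, groups, n_bins, min_count)
        if size <= tol:
            break
        for i in idx:
            preds[i] = min(1.0, max(0.0, preds[i] + gap))
    return preds

# Synthetic data (assumption): correctness depends on two overlapping attributes.
rng = random.Random(0)
n = 4000
g1 = [rng.random() < 0.5 for _ in range(n)]
g2 = [rng.random() < 0.5 for _ in range(n)]
labels = [1 if rng.random() < 0.2 + 0.3 * a + 0.3 * b else 0
          for a, b in zip(g1, g2)]
groups = [
    list(range(n)),
    [i for i in range(n) if g1[i]],
    [i for i in range(n) if not g1[i]],
    [i for i in range(n) if g2[i]],
    [i for i in range(n) if not g2[i]],
]

initial = [sum(labels) / n] * n  # calibrated marginally, but not within groups
final = multicalibrate(initial, labels, groups)
gap_before = worst_cell(initial, labels, groups, 10, 50)[0]
gap_after = worst_cell(final, labels, groups, 10, 50)[0]
print(f"worst group-wise gap: {gap_before:.3f} -> {gap_after:.3f}")
```

The constant initial predictor is perfectly calibrated on average yet off by a wide margin inside each group, which is exactly the failure that marginal calibration cannot detect.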

ME · Jun 17, 2021
Causal Bias Quantification for Continuous Treatments

Gianluca Detommaso, Michael Brückner, Philip Schulz et al.

We extend the definition of the marginal causal effect to the continuous treatment setting and develop a novel characterization of causal bias in the framework of structural causal models. We prove that our derived bias expression is zero if, and only if, the causal effect is identifiable via covariate adjustment. We show that under some restrictions on the structural equations, the causal bias can be estimated efficiently and allows for causal regularization of predictive probabilistic models. We demonstrate the effectiveness of our method for causal bias quantification in various settings where (not) controlling for certain covariates would introduce causal bias.

ML · May 25, 2019
HINT: Hierarchical Invertible Neural Transport for Density Estimation and Bayesian Inference

Jakob Kruse, Gianluca Detommaso, Ullrich Köthe et al.

Many recent invertible neural architectures are based on coupling block designs where variables are divided into two subsets which serve as inputs of an easily invertible (usually affine) triangular transformation. While such a transformation is invertible, its Jacobian is very sparse and thus may lack expressiveness. This work presents a simple remedy by noting that subdivision and (affine) coupling can be repeated recursively within the resulting subsets, leading to an efficiently invertible block with dense, triangular Jacobian. By formulating our recursive coupling scheme via a hierarchical architecture, HINT allows sampling from a joint distribution p(y,x) and the corresponding posterior p(x|y) using a single invertible network. We evaluate our method on some standard data sets and benchmark its full power for density estimation and Bayesian inference on a novel data set of 2D shapes in Fourier parameterization, which enables consistent visualization of samples for different dimensionalities.
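
The basic affine coupling block that the abstract starts from can be sketched in a few lines. This shows a single, non-recursive block only, not the HINT architecture; `scale_and_shift` is a hypothetical conditioner standing in for a learned network.

```python
import math

def scale_and_shift(x1):
    """Hypothetical conditioner: any function of x1 works, since it is never inverted."""
    s = [math.tanh(sum(x1))] * len(x1)  # log-scales
    t = [math.sin(v) for v in x1]       # shifts
    return s, t

def forward(x):
    """Affine coupling: pass x1 through unchanged, transform x2 conditioned on x1."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = scale_and_shift(x1)
    y2 = [v * math.exp(si) + ti for v, si, ti in zip(x2, s, t)]
    # The Jacobian is triangular, so its log-determinant is the sum of log-scales.
    log_det = sum(s)
    return x1 + y2, log_det

def inverse(y):
    """Exact inverse: recompute s and t from the untouched half, then undo the affine map."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = scale_and_shift(y1)
    x2 = [(v - ti) * math.exp(-si) for v, si, ti in zip(y2, s, t)]
    return y1 + x2

x = [0.3, -1.2, 0.7, 2.0]
y, log_det = forward(x)
x_rec = inverse(y)
print(y, log_det, x_rec)
```

Because the first half passes through untouched, most of the Jacobian's lower-left structure is the only place where variables interact, which is the sparsity the abstract refers to.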

ML · Jan 23, 2019
Stein Variational Online Changepoint Detection with Applications to Hawkes Processes and Neural Networks

Gianluca Detommaso, Hanne Hoitzing, Tiangang Cui et al.

Bayesian online changepoint detection (BOCPD) (Adams & MacKay, 2007) offers a rigorous and viable way to identify changepoints in complex systems. In this work, we introduce a Stein variational online changepoint detection (SVOCD) method to provide a computationally tractable generalization of BOCPD beyond the exponential family of probability distributions. We integrate the recently developed Stein variational Newton (SVN) method (Detommaso et al., 2018) and BOCPD to offer a full online Bayesian treatment for a large number of situations with significant importance in practice. We apply the resulting method to two challenging and novel applications: Hawkes processes and long short-term memory (LSTM) neural networks. In both cases, we successfully demonstrate the efficacy of our method on real data.

ML · Jun 8, 2018
A Stein variational Newton method

Gianluca Detommaso, Tiangang Cui, Alessio Spantini et al.

Stein variational gradient descent (SVGD) was recently proposed as a general purpose nonparametric variational inference algorithm [Liu & Wang, NIPS 2016]: it minimizes the Kullback-Leibler divergence between the target distribution and its approximation by implementing a form of functional gradient descent on a reproducing kernel Hilbert space. In this paper, we accelerate and generalize the SVGD algorithm by including second-order information, thereby approximating a Newton-like iteration in function space. We also show how second-order information can lead to more effective choices of kernel. We observe significant computational gains over the original SVGD algorithm in multiple test cases.
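
As background for the second-order method, the first-order SVGD update that the paper accelerates can be sketched for a one-dimensional Gaussian target. This is plain SVGD, not the Stein variational Newton iteration; the target N(2, 1), the fixed RBF bandwidth, the step size, and the particle count are all assumptions for illustration.

```python
import math
import random

def grad_log_p(x, mu=2.0, sigma=1.0):
    """Score function of the assumed target N(mu, sigma**2)."""
    return -(x - mu) / sigma ** 2

def svgd_step(particles, step_size=0.1, bandwidth=1.0):
    """One SVGD update with an RBF kernel k(a, b) = exp(-(a - b)**2 / (2 * bandwidth**2))."""
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            diff = xj - xi
            k = math.exp(-diff * diff / (2.0 * bandwidth ** 2))
            grad_k = -diff / bandwidth ** 2 * k  # derivative of k with respect to xj
            # Attraction toward high density plus repulsion between particles.
            phi += k * grad_log_p(xj) + grad_k
        updated.append(xi + step_size * phi / n)
    return updated

rng = random.Random(0)
particles = [rng.gauss(-4.0, 1.0) for _ in range(30)]  # start far from the target
for _ in range(500):
    particles = svgd_step(particles)

mean = sum(particles) / len(particles)
var = sum((x - mean) ** 2 for x in particles) / len(particles)
print(f"particle mean={mean:.3f}, variance={var:.3f}")
```

The kernel-weighted gradient term pulls particles toward the target, while the kernel-gradient term keeps them from collapsing onto the mode; a Newton-type variant would additionally rescale this direction using curvature information.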