CV | Nov 23, 2022 | Code
BiasBed -- Rigorous Texture Bias Evaluation
Nikolai Kalischek, Rodrigo C. Daudt, Torben Peters et al.
The well-documented presence of texture bias in modern convolutional neural networks has led to a plethora of algorithms that promote an emphasis on shape cues, often to support generalization to new domains. Yet, common datasets, benchmarks and general model selection strategies are missing, and there is no agreed, rigorous evaluation protocol. In this paper, we investigate difficulties and limitations when training networks with reduced texture bias. In particular, we also show that proper evaluation and meaningful comparisons between methods are not trivial. We introduce BiasBed, a testbed for texture- and style-biased training, including multiple datasets and a range of existing algorithms. It comes with an extensive evaluation protocol that includes rigorous hypothesis testing to gauge the significance of the results, despite the considerable training instability of some style bias methods. Our extensive experiments shed new light on the need for careful, statistically founded evaluation protocols for style bias (and beyond). For example, we find that some algorithms proposed in the literature do not significantly mitigate the impact of style bias at all. With the release of BiasBed, we hope to foster a common understanding of consistent and meaningful comparisons, and consequently faster progress towards learning methods free of texture bias. Code is available at https://github.com/D1noFuzi/BiasBed
ML | Jul 7, 2025 | Code
Vecchia-Inducing-Points Full-Scale Approximations for Gaussian Processes
Tim Gyger, Reinhard Furrer, Fabio Sigrist
Gaussian processes are flexible, probabilistic, non-parametric models widely used in machine learning and statistics. However, their scalability to large data sets is limited by computational constraints. To overcome these challenges, we propose Vecchia-inducing-points full-scale (VIF) approximations combining the strengths of global inducing points and local Vecchia approximations. Vecchia approximations excel in settings with low-dimensional inputs and moderately smooth covariance functions, while inducing point methods are better suited to high-dimensional inputs and smoother covariance functions. Our VIF approach bridges these two regimes by using an efficient correlation-based neighbor-finding strategy for the Vecchia approximation of the residual process, implemented via a modified cover tree algorithm. We further extend our framework to non-Gaussian likelihoods by introducing iterative methods that reduce computational costs for training and prediction by several orders of magnitude compared to Cholesky-based computations when using a Laplace approximation. In particular, we propose and compare novel preconditioners and provide theoretical convergence results. Extensive numerical experiments on simulated and real-world data sets show that VIF approximations are both computationally efficient and more accurate and numerically stable than state-of-the-art alternatives. All methods are implemented in the open source C++ library GPBoost with high-level Python and R interfaces.
ME | May 23, 2024
Iterative Methods for Full-Scale Gaussian Process Approximations for Large Spatial Data
Tim Gyger, Reinhard Furrer, Fabio Sigrist
Gaussian processes are flexible probabilistic regression models which are widely used in statistics and machine learning. However, a drawback is their limited scalability to large data sets. To alleviate this, full-scale approximations (FSAs) combine predictive process methods and covariance tapering, thus approximating both global and local structures. We show how iterative methods can be used to reduce computational costs in calculating likelihoods, gradients, and predictive distributions with FSAs. In particular, we introduce a novel preconditioner and show theoretically and empirically that it accelerates the conjugate gradient method's convergence speed and mitigates its sensitivity with respect to the FSA parameters and the eigenvalue structure of the original covariance matrix, and we demonstrate empirically that it outperforms a state-of-the-art pivoted Cholesky preconditioner. Furthermore, we introduce an accurate and fast way to calculate predictive variances using stochastic simulation and iterative methods. In addition, we show how our newly proposed FITC preconditioner can also be used in iterative methods for Vecchia approximations. In our experiments, it outperforms existing state-of-the-art preconditioners for Vecchia approximations. All methods are implemented in a free C++ software library with high-level Python and R packages.
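To make the role of a preconditioner in the conjugate gradient method concrete, the following is a minimal sketch in Python with NumPy. It is not the paper's implementation: the function name `preconditioned_cg`, the toy covariance matrix, and the simple diagonal (Jacobi) preconditioner are all assumptions chosen for illustration, and the diagonal preconditioner merely stands in for the FITC and pivoted Cholesky preconditioners mentioned in the abstract.

```python
import numpy as np


def preconditioned_cg(A, b, apply_Minv, tol=1e-8, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (hypothetical helper).

    apply_Minv(r) applies the inverse of the preconditioner to a vector.
    Returns the approximate solution and the number of iterations used.
    """
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x                # residual
    z = apply_Minv(r)            # preconditioned residual
    p = z.copy()                 # search direction
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        # Stop once the relative residual is small enough.
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = apply_Minv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


# Toy problem (assumption): exponential covariance on random 2-D locations
# plus location-specific noise variances on the diagonal.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
noise_var = rng.uniform(0.1, 1.0, size=200)
A = np.exp(-dists / 0.2) + np.diag(noise_var)
b = rng.normal(size=200)

# Diagonal (Jacobi) preconditioner as a simple stand-in.
diag_A = np.diag(A).copy()
x, n_iter = preconditioned_cg(A, b, lambda r: r / diag_A)
print(n_iter, np.linalg.norm(A @ x - b))
```

Only matrix-vector products with `A` are needed, which is why such solvers combine well with structured approximations; a better preconditioner would be swapped in by changing `apply_Minv` alone.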
ML | Nov 20, 2019
Additive Bayesian Network Modelling with the R Package abn
Gilles Kratzer, Fraser Iain Lewis, Arianna Comin et al.
The R package abn is designed to fit additive Bayesian models to observational datasets. It contains routines to score Bayesian networks based on Bayesian or information theoretic formulations of generalized linear models. It is equipped with exact search and greedy search algorithms to select the best network. It supports a possible blend of continuous, discrete and count data and input of prior knowledge at a structural level. The Bayesian implementation supports random effects to control for one-layer clustering. In this paper, we give an overview of the methodology and illustrate the package's functionalities using a veterinary dataset about respiratory diseases in commercial swine production.
CO | Feb 18, 2019
Is a single unique Bayesian network enough to accurately represent your data?
Gilles Kratzer, Reinhard Furrer
Bayesian network (BN) modelling is extensively used in systems epidemiology. Usually it consists of selecting and reporting the best-fitting structure conditional on the data. A major practical concern is avoiding overfitting, which arises from the extreme flexibility and modelling richness of BNs. Many approaches have been proposed to control for overfitting. Unfortunately, essentially all of them rely on very crude decisions that result in overly simplistic approaches for such complex systems. In practice, with limited data sampled from a complex system, reporting a single best structure seems too simplistic. An alternative would be to use the Monte Carlo Markov chain model choice (MC3) over the network to learn the landscape of reasonably supported networks, and then to present all possible arcs with their MCMC support. This paper presents an R implementation, called mcmcabn, of a flexible structural MC3 that is accessible to non-specialists.
ME | Sep 18, 2018
Comparison between Suitable Priors for Additive Bayesian Networks
Gilles Kratzer, Reinhard Furrer, Marta Pittavino
Additive Bayesian networks are graphical models that extend the usual Bayesian generalized linear model to multiple dependent variables through the factorisation of the joint probability distribution of the underlying variables. When fitting an ABN model, the choice of the prior for the parameters is of crucial importance. If an inadequate prior, such as one that is too weakly informative, is used, data separation and data sparsity lead to issues in the model selection process. In this work, a simulation study comparing two weakly informative priors and one strongly informative prior is presented. The first weakly informative prior is a zero-mean Gaussian prior with a large variance, currently implemented in the R package abn. The second is a Student's t-distribution specifically designed for logistic regressions. Finally, the strongly informative prior is again Gaussian, with mean equal to the true parameter value and a small variance. We compare the impact of these priors on the accuracy of the learned additive Bayesian network as a function of different parameters. We create a simulation study to illustrate Lindley's paradox based on the prior choice. We then conclude by highlighting the good performance of the informative Student's t-prior and the limited impact of Lindley's paradox. Finally, suggestions for further developments are provided.
ML | Aug 3, 2018
Information-Theoretic Scoring Rules to Learn Additive Bayesian Network Applied to Epidemiology
Gilles Kratzer, Reinhard Furrer
Bayesian network modelling is a well-adapted approach to studying messy and highly correlated datasets, which are very common in, e.g., systems epidemiology. A popular approach to learning a Bayesian network from an observational dataset is to identify the maximum a posteriori network in a search-and-score approach. Many scores have been proposed, both Bayesian and frequentist. From an applied perspective, a suitable approach would allow multiple distributions for the data and be robust enough to run autonomously. Generalized linear models are a promising framework for computing scores. Indeed, there exist fast algorithms for estimation and many tailored solutions to common epidemiological issues. The purpose of this paper is to present the R package abn, which implements multiple frequentist scores, together with some realistic simulations that show its usability and performance. It includes features to deal efficiently with data separation and adjustment, which are very common in systems epidemiology.
ML | Apr 19, 2018
varrank: an R package for variable ranking based on mutual information with applications to observed systemic datasets
Gilles Kratzer, Reinhard Furrer
This article describes the R package varrank. It has a flexible implementation of heuristic approaches which perform variable ranking based on mutual information. The package is particularly suitable for exploring multivariate datasets requiring a holistic analysis. The core functionality is a general implementation of the minimum redundancy maximum relevance (mRMRe) model. This approach is based on information theory metrics. It is compatible with discrete and continuous data which are discretised using a large choice of possible rules. The two main problems that can be addressed by this package are the selection of the most representative variables for modeling a collection of variables of interest, i.e., dimension reduction, and variable ranking with respect to a set of variables of interest.
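As a rough illustration of mutual-information-based variable ranking, here is a minimal Python sketch of a greedy minimum redundancy maximum relevance search on discrete data. It does not reproduce the varrank API or its exact scoring schemes: the function names `mutual_information` and `mrmr_ranking`, the plug-in estimator, the "relevance minus mean redundancy" score, and the toy data are all assumptions made for this example.

```python
import numpy as np


def mutual_information(x, y):
    """Plug-in mutual information (in nats) between two discrete 1-D arrays."""
    _, xi = np.unique(np.ravel(x), return_inverse=True)
    _, yi = np.unique(np.ravel(y), return_inverse=True)
    # Joint contingency table, normalised to a probability table.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    # Product of the marginals, same shape as the joint table.
    outer = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / outer[nz])))


def mrmr_ranking(X, y):
    """Greedy forward ranking: relevance to y minus mean redundancy
    with the variables already selected (difference form, an assumption)."""
    n_features = X.shape[1]
    relevance = np.array(
        [mutual_information(X[:, j], y) for j in range(n_features)]
    )
    selected = [int(np.argmax(relevance))]
    remaining = [j for j in range(n_features) if j != selected[0]]
    while remaining:
        scores = []
        for j in remaining:
            redundancy = np.mean(
                [mutual_information(X[:, j], X[:, s]) for s in selected]
            )
            scores.append(relevance[j] - redundancy)
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumption): y is determined by two independent bits a and b.
# Column 1 duplicates column 0, and column 3 is unrelated noise.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=2000)
b = rng.integers(0, 2, size=2000)
noise = rng.integers(0, 2, size=2000)
y = 2 * a + b
X = np.column_stack([a, a, b, noise])

ranking = mrmr_ranking(X, y)
print(ranking)
```

In this toy setting the duplicated column is penalised for its redundancy with the copy already selected, so the two informative, non-redundant columns are ranked ahead of it, and the noise column comes last.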