LGJan 29
Making Foundation Models Probabilistic via Singular Value EnsemblesMehmet Ozgur Turkoglu, Dominik J. Mühlematter, Alexander Becker et al.
Foundation models have become a dominant paradigm in machine learning, achieving remarkable performance across diverse tasks through large-scale pretraining. However, these models often yield overconfident, uncalibrated predictions. The standard approach to quantifying epistemic uncertainty, training an ensemble of independent models, incurs prohibitive computational costs that scale linearly with ensemble size, making it impractical for large foundation models. We propose Singular Value Ensemble (SVE), a parameter-efficient implicit ensemble method that builds on a simple, but powerful core assumption: namely, that the singular vectors of the weight matrices constitute meaningful subspaces of the model's knowledge. Pretrained foundation models encode rich, transferable information in their weight matrices. If the singular vectors are indeed meaningful (orthogonal) "knowledge directions". To obtain a model ensemble, we modulate only how strongly each direction contributes to the output. Rather than learning entirely new parameters, we freeze the singular vectors and only train per-member singular values that rescale the contribution of each direction in that shared knowledge basis. Ensemble diversity emerges naturally as stochastic initialization and random sampling of mini-batches during joint training cause different members to converge to different combinations of the same underlying knowledge. SVE achieves uncertainty quantification comparable to explicit deep ensembles while increasing the parameter count of the base model by less than 1%, making principled uncertainty estimation accessible in resource-constrained settings. We validate SVE on NLP and vision tasks with various different backbones and show that it improves calibration while maintaining predictive accuracy.
LGMay 23, 2024
LoRA-Ensemble: Efficient Uncertainty Modelling for Self-Attention NetworksDominik J. Mühlematter, Michelle Halbheer, Alexander Becker et al.
Numerous real-world decisions rely on machine learning algorithms and require calibrated uncertainty estimates. However, modern methods often yield overconfident, uncalibrated predictions. The dominant approach to quantifying the uncertainty inherent in the model is to train an ensemble of separate predictors and measure their empirical variance. In an explicit implementation, the ensemble has high computational cost and memory footprint, especially if the base model itself is already large, like modern transformers. This motivates efforts to develop implicit ensemble methods that emulate the ensemble without explicitly instantiating all its members. We introduce LoRA-Ensemble, a parameter-efficient ensembling method for self-attention networks. It is based on Low-Rank Adaptation (LoRA), originally developed for efficient LLM fine-tuning, and extends it into an implicit ensembling scheme, where all ensemble members share the same, pre-trained self-attention network, but have individual low-rank matrices for the attention projections. The resulting method not only outperforms state-of-the-art implicit techniques like BatchEnsemble, but even matches or exceeds the accuracy of an Explicit Ensemble, while at the same time achieving superior calibration.
CVJun 15, 2025
Model-Agnostic, Temperature-Informed Sampling Enhances Cross-Year Crop Mapping with Deep LearningMehmet Ozgur Turkoglu, Selene Ledain, Helge Aasen
Crop type classification using optical satellite time series remains limited in its ability to generalize across seasons, particularly when crop phenology shifts due to inter-annual weather variability. This hampers real-world applicability in scenarios where current-year labels are unavailable. In addition, uncertainty quantification is often overlooked, which reduces the reliability of such approaches for operational crop monitoring. Inspired by ecophysiological principles of plant growth, we propose a simple, model-agnostic Thermal-Time-based Temporal Sampling (T3S) method that replaces calendar time with thermal time. By subsampling time series in this biologically meaningful way, our method highlights key periods within the growing season while reducing temporal redundancy and noise. We evaluate the T3S on a multi-year Sentinel-2 dataset covering the entirety of Switzerland, which allows us to assess all applied methods on unseen years. Compared to state-of-the-art baselines, our approach yields substantial improvements in classification accuracy and, critically, provides well-calibrated uncertainty estimates. Moreover, the T3S method excels in low-data regimes and enables significantly more accurate early-season classification. With just 10% of the training labels, it outperforms the current baseline in both accuracy and uncertainty calibration, and by the end of June, it achieves a performance similar to the full-season baseline model.
CVMay 17, 2021
Global Wheat Head Dataset 2021: more diversity to improve the benchmarking of wheat head localization methodsEtienne David, Mario Serouart, Daniel Smith et al.
The Global Wheat Head Detection (GWHD) dataset was created in 2020 and has assembled 193,634 labelled wheat heads from 4,700 RGB images acquired from various acquisition platforms and 7 countries/institutions. With an associated competition hosted in Kaggle, GWHD has successfully attracted attention from both the computer vision and agricultural science communities. From this first experience in 2020, a few avenues for improvements have been identified, especially from the perspective of data size, head diversity and label reliability. To address these issues, the 2020 dataset has been reexamined, relabeled, and augmented by adding 1,722 images from 5 additional countries, allowing for 81,553 additional wheat heads to be added. We now release a new version of the Global Wheat Head Detection (GWHD) dataset in 2021, which is bigger, more diverse, and less noisy than the 2020 version. The GWHD 2021 is now publicly available at http://www.global-wheat.com/ and a new data challenge has been organized on AIcrowd to make use of this updated dataset.