LG · Mar 28, 2022
Safe Active Learning for Multi-Output Gaussian Processes
Cen-You Li, Barbara Rakitsch, Christoph Zimmer
Multi-output regression problems are commonly encountered in science and engineering. In particular, multi-output Gaussian processes have emerged as a promising tool for modeling these complex systems since they can exploit the inherent correlations and provide reliable uncertainty estimates. In many applications, however, acquiring the data is expensive and safety concerns might arise (e.g. robotics, engineering). We propose a safe active learning approach for multi-output Gaussian process regression. This approach queries the most informative data or output, taking the relatedness between the regressors and safety constraints into account. We prove the effectiveness of our approach by providing theoretical analysis and by demonstrating empirical results on simulated datasets and on a real-world engineering dataset. On all datasets, our approach shows improved convergence compared to its competitors.
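To make the idea of a safety-constrained query concrete, here is a minimal single-output sketch, not the multi-output method of the paper. It assumes a zero-mean GP with a unit-variance RBF kernel and fixed hyperparameters, a second GP on a safety signal z, and a hypothetical rule that a candidate counts as safe when the lower confidence bound of z stays above `z_min`. All function names and parameters are illustrative.

```python
import numpy as np

def rbf(a, b, lengthscale=0.5):
    """Unit-variance squared-exponential kernel between 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    # Prior variance is 1 for this kernel; clip guards against round-off.
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mean, var

def safe_query(x_train, y_train, z_train, candidates, z_min=0.0, beta=1.645):
    """Index of the safe candidate with the largest predictive variance.

    Assumed safety rule: mean_z - beta * std_z >= z_min.
    Returns None when no candidate satisfies it.
    """
    _, var_y = gp_posterior(x_train, y_train, candidates)
    mean_z, var_z = gp_posterior(x_train, z_train, candidates)
    safe = mean_z - beta * np.sqrt(var_z) >= z_min
    if not safe.any():
        return None
    return int(np.argmax(np.where(safe, var_y, -np.inf)))
```

With only a few observations, the safe set stays close to the observed points, because the lower confidence bound of z drops as the posterior uncertainty grows away from the data.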
LG · Aug 3, 2024
Batch Active Learning in Gaussian Process Regression using Derivatives
Hon Sum Alec Yu, Christoph Zimmer, Duy Nguyen-Tuong
We investigate the use of derivative information for Batch Active Learning in Gaussian Process regression models. The proposed approach employs the predictive covariance matrix for selection of data batches to exploit the full correlation between samples. We theoretically analyse our proposed algorithm taking different optimality criteria into consideration and provide empirical comparisons highlighting the advantage of incorporating derivative information. Our results show the effectiveness of our approach across diverse applications.
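One way to picture batch selection from a predictive covariance matrix is a greedy search that maximizes the log-determinant of the batch's posterior covariance (a D-optimality style criterion). The sketch below is only an illustration under assumptions: it uses a plain RBF kernel without derivative observations, fixed hyperparameters, and hypothetical function names. It is not the algorithm analysed in the paper.

```python
import numpy as np

def rbf(a, b, lengthscale=0.3):
    """Unit-variance squared-exponential kernel between 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def posterior_cov(x_train, x_test, noise=1e-3):
    """Full posterior covariance matrix of the latent function at x_test."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    return rbf(x_test, x_test) - Ks.T @ np.linalg.solve(K, Ks)

def greedy_batch(x_train, candidates, batch_size, jitter=1e-6):
    """Greedily add the candidate that most increases log det of the batch covariance."""
    cov = posterior_cov(x_train, candidates)
    chosen = []
    for _ in range(batch_size):
        best, best_score = None, -np.inf
        for j in range(len(candidates)):
            if j in chosen:
                continue
            idx = chosen + [j]
            sub = cov[np.ix_(idx, idx)] + jitter * np.eye(len(idx))
            sign, logdet = np.linalg.slogdet(sub)
            score = logdet if sign > 0 else -np.inf
            if score > best_score:
                best, best_score = j, score
        chosen.append(best)
    return chosen
```

Because the determinant accounts for off-diagonal covariance, the greedy search avoids filling a batch with strongly correlated neighbours, which selecting by marginal variance alone would not prevent.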
LG · Oct 21, 2022
Structural Kernel Search via Bayesian Optimization and Symbolical Optimal Transport
Matthias Bitzer, Mona Meister, Christoph Zimmer
Despite recent advances in automated machine learning, model selection is still a complex and computationally intensive process. For Gaussian processes (GPs), selecting the kernel is a crucial task, often done manually by the expert. Additionally, evaluating the model selection criteria for Gaussian processes typically scales cubically in the sample size, rendering kernel search particularly computationally expensive. We propose a novel, efficient search method through a general, structured kernel space. Previous methods solved this task via Bayesian optimization and relied on measuring the distance between GPs directly in function space to construct a kernel-kernel. We present an alternative approach by defining a kernel-kernel over the symbolic representation of the statistical hypothesis that is associated with a kernel. We empirically show that this leads to a computationally more efficient way of searching through a discrete kernel space.
LG · Jun 16, 2023
Amortized Inference for Gaussian Process Hyperparameters of Structured Kernels
Matthias Bitzer, Mona Meister, Christoph Zimmer
Learning the kernel parameters for Gaussian processes is often the computational bottleneck in applications such as online learning, Bayesian optimization, or active learning. Amortizing parameter inference over different datasets is a promising approach to dramatically speed up training time. However, existing methods restrict the amortized inference procedure to a fixed kernel structure. The amortization network must be redesigned manually and trained again if a different kernel is employed, which leads to a large overhead in design time and training time. We propose amortizing kernel parameter inference over a complete kernel-structure family rather than a fixed kernel structure. We do so by defining an amortization network over pairs of datasets and kernel structures. This enables fast kernel inference for each element in the kernel family without retraining the amortization network. As a by-product, our amortization network is able to do fast ensembling over kernel structures. In our experiments, we show drastically reduced inference time combined with competitive test performance for a large set of kernels and datasets.
LG · Mar 17, 2023
Hierarchical-Hyperplane Kernels for Actively Learning Gaussian Process Models of Nonstationary Systems
Matthias Bitzer, Mona Meister, Christoph Zimmer
Learning precise surrogate models of complex computer simulations and physical machines often requires long-lasting or expensive experiments. Furthermore, the modeled physical dependencies exhibit nonlinear and nonstationary behavior. Machine learning methods that are used to produce the surrogate model should therefore address these problems by providing a scheme to keep the number of queries small, e.g. by using active learning, and by being able to capture the nonlinear and nonstationary properties of the system. One way of modeling the nonstationarity is to induce input-partitioning, a principle that has proven to be advantageous in active learning for Gaussian processes. However, these methods either assume a known partitioning, need to introduce complex sampling schemes, or rely on very simple geometries. In this work, we present a simple yet powerful kernel family that incorporates a partitioning that: i) is learnable via gradient-based methods, ii) uses a geometry that is more flexible than previous ones, while still being applicable in the low-data regime. Thus, it provides a good prior for active learning procedures. We empirically demonstrate excellent performance on various active learning tasks.
NA · Jan 22, 2019
Time discretization schemes for hyperbolic systems on networks by $ε$-expansion
Robert Altmann, Christoph Zimmer
We consider partial differential equations on networks with a small parameter $ε$, which are hyperbolic for $ε>0$ and parabolic for $ε=0$. With a combination of an $ε$-expansion and Runge-Kutta schemes for constrained systems of parabolic type, we derive a new class of time discretization schemes for hyperbolic systems on networks, which are constrained due to interconnection conditions. For the analysis we consider the coupled system equations as partial differential-algebraic equations based on the variational formulation of the problem. We discuss well-posedness of the resulting systems and estimate the error caused by the $ε$-expansion.
LG · Feb 9, 2024
Safe Active Learning for Time-Series Modeling with Gaussian Processes
Christoph Zimmer, Mona Meister, Duy Nguyen-Tuong
Learning time-series models is useful for many applications, such as simulation and forecasting. In this study, we consider the problem of actively learning time-series models while taking given safety constraints into account. For time-series modeling we employ a Gaussian process with a nonlinear exogenous input structure. The proposed approach generates data appropriate for time series model learning, i.e. input and output trajectories, by dynamically exploring the input space. The approach parametrizes the input trajectory as consecutive trajectory sections, which are determined stepwise given safety requirements and past observations. We analyze the proposed algorithm and evaluate it empirically on a technical application. The results show the effectiveness of our approach in a realistic technical use case.
LG · Dec 16, 2025
Causal Structure Learning for Dynamical Systems with Theoretical Score Analysis
Nicholas Tagliapietra, Katharina Ensinger, Christoph Zimmer et al.
Real-world systems evolve in continuous time according to their underlying causal relationships, yet their dynamics are often unknown. Existing approaches to learning such dynamics typically either discretize time -- leading to poor performance on irregularly sampled data -- or ignore the underlying causality. We propose CaDyT, a novel method for causal discovery on dynamical systems addressing both these challenges. In contrast to state-of-the-art causal discovery methods that model the problem using discrete-time dynamic Bayesian networks, our formulation is grounded in difference-based causal models, which allow milder assumptions for modeling the continuous nature of the system. CaDyT leverages exact Gaussian process inference for modeling the continuous-time dynamics, which is more aligned with the underlying dynamical process. We propose a practical instantiation that identifies the causal structure via a greedy search guided by the Algorithmic Markov Condition and the Minimum Description Length principle. Our experiments show that CaDyT outperforms state-of-the-art methods on both regularly and irregularly sampled data, discovering causal networks closer to the true underlying dynamics.
LG · Jul 25, 2024
Amortized Active Learning for Nonparametric Functions
Cen-You Li, Marc Toussaint, Barbara Rakitsch et al.
Active learning (AL) is a sequential learning scheme aiming to select the most informative data. AL reduces data consumption and avoids the cost of labeling large amounts of data. However, AL trains the model and solves an acquisition optimization for each selection, which becomes expensive when the model training or acquisition optimization is challenging. In this paper, we focus on active nonparametric function learning, where the gold-standard Gaussian process (GP) approaches suffer from cubic time complexity. We propose an amortized AL method, where new data are suggested by a neural network which is trained up-front without any real data (Figure 1). Our method avoids repeated model training and requires no acquisition optimization during the AL deployment. We (i) utilize GPs as function priors to construct an AL simulator, (ii) train an AL policy that can zero-shot generalize from simulation to real learning problems of nonparametric functions and (iii) achieve real-time data selection and learning performance comparable to time-consuming baseline methods.
LG · Feb 28, 2024
Efficiently Computable Safety Bounds for Gaussian Processes in Active Learning
Jörn Tebbe, Christoph Zimmer, Ansgar Steland et al.
Active learning of physical systems must commonly respect practical safety constraints, which restrict the exploration of the design space. Gaussian Processes (GPs) and their calibrated uncertainty estimates are widely used for this purpose. In many technical applications the design space is explored via continuous trajectories, along which the safety needs to be assessed. This is particularly challenging for strict safety requirements in GP methods, as assessing them requires computationally expensive Monte Carlo sampling of high quantiles. We address these challenges by providing provable safety bounds based on the adaptively sampled median of the supremum of the posterior GP. Our method significantly reduces the number of samples required for estimating high safety probabilities, resulting in faster evaluation without sacrificing accuracy and exploration speed. The effectiveness of our safe active learning approach is demonstrated through extensive simulations and validated using a real-world engine example.
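For orientation, the expensive baseline that the abstract refers to can be sketched as a plain Monte Carlo estimate of the probability that a GP stays below a threshold along a discretized trajectory. This is an illustrative sketch, not the paper's bound: the posterior mean and covariance on the trajectory grid are assumed to be given, and the function name, jitter, and sample count are hypothetical choices.

```python
import numpy as np

def mc_safety_probability(mean, cov, threshold, n_samples=2000, rng=None, jitter=1e-8):
    """Estimate P(max_t f(t) <= threshold) for f ~ N(mean, cov) on a trajectory grid."""
    rng = np.random.default_rng() if rng is None else rng
    # Cholesky factor of the (jittered) covariance turns white noise into GP paths.
    L = np.linalg.cholesky(cov + jitter * np.eye(len(mean)))
    paths = mean[:, None] + L @ rng.standard_normal((len(mean), n_samples))
    # A path is safe when its maximum over the grid stays below the threshold.
    return float(np.mean(paths.max(axis=0) <= threshold))
```

Resolving a safety probability close to 1 this way needs many samples, since only a small fraction of sampled paths violates the threshold. That cost is what motivates cheaper bounds.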
LG · Feb 22, 2024
Global Safe Sequential Learning via Efficient Knowledge Transfer
Cen-You Li, Olaf Duennbier, Marc Toussaint et al.
Sequential learning methods, such as active learning and Bayesian optimization, aim to select the most informative data for task learning. In many applications, however, data selection is constrained by unknown safety conditions, motivating the development of safe learning approaches. A promising line of safe learning methods uses Gaussian processes to model safety conditions, restricting data selection to areas with high safety confidence. However, these methods are limited to local exploration around an initial seed dataset, as safety confidence centers around observed data points. As a consequence, task exploration is slowed down and safe regions disconnected from the initial seed dataset remain unexplored. In this paper, we propose safe transfer sequential learning to accelerate task learning and to expand the explorable safe region. By leveraging abundant offline data from a related source task, our approach guides exploration in the target task more effectively. We also provide a theoretical analysis to explain why single-task methods cannot cope with disconnected regions. Finally, we introduce a computationally efficient approximation of our method that reduces runtime through pre-computations. Our experiments demonstrate that this approach, compared to state-of-the-art methods, learns tasks with lower data consumption and enhances global exploration across multiple disjoint safe regions, while maintaining comparable computational efficiency.
LG · Jan 26, 2025
Amortized Safe Active Learning for Real-Time Data Acquisition: Pretrained Neural Policies from Simulated Nonparametric Functions
Cen-You Li, Marc Toussaint, Barbara Rakitsch et al.
Safe active learning (AL) is a sequential scheme for learning unknown systems while respecting safety constraints during data acquisition. Existing methods often rely on Gaussian processes (GPs) to model the task and safety constraints, requiring repeated GP updates and constrained acquisition optimization, which incur significant computation and are challenging for real-time decision-making. We propose an amortized safe AL framework that replaces expensive online computations with a pretrained neural policy. Inspired by recent advances in amortized Bayesian experimental design, we turn GPs into a pretraining simulator. We train our policy prior to the AL deployment on simulated nonparametric functions, using Fourier feature-based GP sampling and a differentiable, safety-aware acquisition objective. At deployment, our policy selects safe and informative queries via a single forward pass, eliminating the need for GP inference or constrained optimization. This leads to substantial speed improvements while preserving safety and learning quality. Our framework is modular and can be adapted to unconstrained, time-sensitive AL tasks by omitting the safety requirement.
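Fourier feature-based GP sampling can be illustrated with the standard random Fourier feature construction for an RBF kernel. The sketch below is an assumption about what such a simulator might look like, not the paper's implementation: the feature count, lengthscale, and function name are illustrative.

```python
import numpy as np

def sample_rff_function(rng, dim, n_features=256, lengthscale=1.0):
    """Draw an approximate sample from a zero-mean, unit-variance RBF GP prior.

    Returns a callable f that maps inputs of shape (n, dim) to outputs of shape (n,).
    """
    # Frequencies follow the spectral density of the RBF kernel.
    omega = rng.normal(scale=1.0 / lengthscale, size=(n_features, dim))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    weights = rng.normal(size=n_features)

    def f(x):
        x = np.atleast_2d(x)
        features = np.sqrt(2.0 / n_features) * np.cos(x @ omega.T + phase)
        return features @ weights

    return f
```

Each call returns a fixed function that can be evaluated at arbitrary inputs in time linear in the number of features, so a simulator can generate many synthetic learning problems without building or factorizing a kernel matrix.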
LG · Dec 12, 2024
Safe Active Learning for Gaussian Differential Equations
Leon Glass, Katharina Ensinger, Christoph Zimmer
Gaussian Process differential equations (GPODE) have recently gained momentum due to their ability to capture the dynamic behavior of systems and to represent uncertainty in predictions. Prior work has described the process of training the hyperparameters and, thereby, calibrating GPODE to data. How to design efficient algorithms to collect data for training GPODE models is still an open field of research. Nevertheless, high-quality training data is key for model performance. Furthermore, data collection incurs time and financial costs and might in some areas even be safety-critical for the system under test. Therefore, algorithms for safe and efficient data collection are central to building high-quality GPODE models. Our novel Safe Active Learning (SAL) for GPODE algorithm addresses this challenge by suggesting a mechanism to propose efficient and non-safety-critical data to collect. SAL GPODE does so by sequentially suggesting new data, measuring it and updating the GPODE model with the new data. In this way, subsequent data points are iteratively suggested. The core of our SAL GPODE algorithm is a constrained optimization problem that maximizes the information of new data for GPODE model training, constrained by the safety of the underlying system. We demonstrate our novel SAL GPODE's superiority compared to a standard, non-active way of measuring new data on two relevant examples.
LG · May 17, 2024
Future Aware Safe Active Learning of Time Varying Systems using Gaussian Processes
Markus Lange-Hegermann, Christoph Zimmer
Experimental exploration of high-cost systems with safety constraints, common in engineering applications, is a challenging endeavor. Data-driven models offer a promising solution, but acquiring the requisite data remains expensive and is potentially unsafe. Safe active learning techniques prove essential, enabling the learning of high-quality models with minimal expensive data points and high safety. This paper introduces a safe active learning framework tailored for time-varying systems, addressing drift, seasonal changes, and complexities due to dynamic behavior. The proposed Time-aware Integrated Mean Squared Prediction Error (T-IMSPE) method minimizes posterior variance over current and future states, optimizing information gathering also in the time domain. Empirical results highlight T-IMSPE's advantages in model quality through toy and real-world examples. State-of-the-art Gaussian processes are compatible with T-IMSPE. Our theoretical contributions include a clear delineation of which Gaussian process kernels, domains, and weighting measures are suitable for T-IMSPE and, beyond that, for its non-time-aware predecessor IMSPE.
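The non-time-aware IMSPE criterion mentioned above can be sketched as choosing the query that minimizes the posterior variance averaged over a set of reference points. The code below is a minimal illustration under assumptions: a unit-variance RBF kernel with fixed hyperparameters, a grid average in place of the integral, a uniform weighting measure, and hypothetical function names. It omits the safety constraint and is not the paper's T-IMSPE.

```python
import numpy as np

def rbf(a, b, lengthscale=0.3):
    """Unit-variance RBF kernel between rows of a (n, d) and b (m, d)."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale ** 2)

def integrated_variance(x_train, x_ref, noise=1e-3):
    """Average posterior variance over x_ref (grid approximation of the integral)."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_ref)
    # With fixed hyperparameters the posterior variance does not depend on the targets.
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return float(var.mean())

def imspe_query(x_train, candidates, x_ref):
    """Index of the candidate whose addition minimizes the integrated variance."""
    scores = [
        integrated_variance(np.vstack([x_train, c[None, :]]), x_ref)
        for c in candidates
    ]
    return int(np.argmin(scores))
```

Since inputs are rows of arbitrary dimension, one hypothetical way to mimic a time-aware variant is to append time as an extra input column and place the reference points at current and future time stamps.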
LG · Jul 30, 2021
Active Learning in Gaussian Process State Space Model
Hon Sum Alec Yu, Dingling Yao, Christoph Zimmer et al.
We investigate active learning in Gaussian Process state-space models (GPSSM). Our problem is to actively steer the system through latent states by determining its inputs such that the underlying dynamics can be optimally learned by a GPSSM. To select the most informative inputs, we employ mutual information as our active learning criterion. In particular, we present two approaches for the approximation of mutual information for the GPSSM given latent states. The proposed approaches are evaluated on several physical systems where we actively learn the underlying non-linear dynamics represented by the state-space model.