Constraining the Parameters of High-Dimensional Models with Active Learning
This work addresses a computational bottleneck for researchers in fields like particle physics and astronomy, offering an incremental improvement over existing sampling methods.
The paper tackles the computationally expensive problem of constraining parameters in high-dimensional physical models. It uses the active learning techniques query-by-committee and query-by-dropout-committee to identify model points in interesting regions of parameter space, which yields more efficient parameter constraints and better-performing machine learning models from the same amount of data.
Constraining the parameters of physical models with more than 5 to 10 parameters is a widespread problem in fields like particle physics and astronomy. The generation of data to explore this parameter space often requires large amounts of computational resources. The commonly used solution of reducing the number of relevant physical parameters hampers the generality of the results. In this paper we show that this problem can be alleviated by the use of active learning. We illustrate this with examples from high energy physics, a field where simulations are often expensive and parameter spaces are high-dimensional. We show that the active learning techniques query-by-committee and query-by-dropout-committee allow for the identification of model points in interesting regions of high-dimensional parameter spaces (e.g., around decision boundaries). This makes it possible to constrain model parameters more efficiently than is currently done with the most common sampling algorithms and to train better-performing machine learning models on the same amount of data. Code implementing the experiments in this paper can be found on GitHub.
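To make the query-by-committee idea concrete, the following is a minimal sketch and not the implementation used in the paper. Several choices are assumptions made purely for illustration: the toy `oracle` (a hypersphere standing in for an expensive simulation), the committee of 1-nearest-neighbour classifiers trained on bootstrap resamples, the disagreement score $p(1-p)$ computed from the fraction $p$ of positive votes, and all function names. The sketch repeatedly draws a pool of candidate points, ranks them by committee disagreement, and sends only the most contested points to the oracle.

```python
import math
import random


def oracle(x, radius=0.9):
    """Toy stand-in for an expensive simulation (assumption): 1 inside a hypersphere."""
    return 1 if math.sqrt(sum(v * v for v in x)) < radius else 0


def nearest_label(x, points, labels):
    """Predict the label of x with a 1-nearest-neighbour rule."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(x, points[i]))
    return labels[min(range(len(points)), key=sq_dist)]


def committee_disagreement(x, committee):
    """Return p * (1 - p), where p is the fraction of members voting 1.

    The score is 0 for a unanimous committee and 0.25 for an even split.
    """
    votes = [nearest_label(x, pts, labs) for pts, labs in committee]
    p = sum(votes) / len(votes)
    return p * (1.0 - p)


def query_by_committee(labeled, labels, pool, n_members=5, n_queries=10, rng=None):
    """Pick the n_queries pool points on which a bootstrap committee disagrees most."""
    rng = rng or random.Random(0)
    committee = []
    for _ in range(n_members):
        # Each member sees a bootstrap resample of the labeled data.
        idx = [rng.randrange(len(labeled)) for _ in labeled]
        committee.append(([labeled[i] for i in idx], [labels[i] for i in idx]))
    ranked = sorted(pool, key=lambda x: committee_disagreement(x, committee), reverse=True)
    return ranked[:n_queries]


def run_active_learning(dim=3, n_initial=20, n_rounds=5, pool_size=200, seed=0):
    """Grow a labeled set by querying the oracle only at high-disagreement points."""
    rng = random.Random(seed)

    def sample():
        return tuple(rng.uniform(-1.0, 1.0) for _ in range(dim))

    labeled = [sample() for _ in range(n_initial)]
    labels = [oracle(x) for x in labeled]
    for _ in range(n_rounds):
        pool = [sample() for _ in range(pool_size)]  # cheap candidates, no oracle calls
        queries = query_by_committee(labeled, labels, pool, rng=rng)
        labeled.extend(queries)
        labels.extend(oracle(x) for x in queries)  # the only expensive step
    return labeled, labels


if __name__ == "__main__":
    points, point_labels = run_active_learning()
    print(len(points), "labeled points,", sum(point_labels), "inside the allowed region")
```

Because committee members tend to disagree near the decision boundary, the oracle budget is concentrated there rather than spread uniformly over the parameter space. Query-by-dropout-committee follows the same loop; as we understand it, the committee is instead formed from a single neural network evaluated under different dropout masks, which this sketch does not implement.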