Generative structured normalizing flow Gaussian processes applied to spectroscopic data
This work addresses uncertainty quantification for high-dimensional scientific data such as spectra, which is critical for applications like planetary exploration. The contribution appears incremental, however, because it combines two existing methods (normalizing flows and Gaussian processes) for a specific domain.
The authors address the problem of mapping inputs to structured, high-dimensional outputs with uncertainty quantification, particularly when training data are limited, as is common in the physical sciences. They propose a generative model that combines structured conditional normalizing flows with Gaussian process regression. They demonstrate it on spectroscopic data from the Mars rover Curiosity, generating realistic spectra and quantifying uncertainty in chemical compositions for newly observed spectra.
In this work, we propose a novel generative model for mapping inputs to structured, high-dimensional outputs using structured conditional normalizing flows and Gaussian process regression. The model is motivated by the need to characterize uncertainty in the input/output relationship when making inferences on new data. In particular, in the physical sciences, limited training data may not adequately characterize future observed data; it is critical that models adequately indicate uncertainty, particularly when they may be asked to extrapolate. In our proposed model, structured conditional normalizing flows provide parsimonious latent representations that relate to the inputs through a Gaussian process, providing exact likelihood calculations and uncertainty that naturally increases away from the training data inputs. We demonstrate the methodology on laser-induced breakdown spectroscopy data from the ChemCam instrument onboard the Mars rover Curiosity. ChemCam was designed to recover the chemical composition of rock and soil samples by measuring the spectral properties of plasma atomic emissions induced by a laser pulse. We show that our model can generate realistic spectra conditional on a given chemical composition and that we can use the model to perform uncertainty quantification of chemical compositions for new observed spectra. Based on our results, we anticipate that our proposed modeling approach may be useful in other scientific domains with high-dimensional, complex structure where it is important to quantify predictive uncertainty.
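The claim that Gaussian process uncertainty "naturally increases away from the training data inputs" can be illustrated with a minimal sketch. This is not the authors' model: it omits the normalizing flow entirely and shows only plain one-dimensional Gaussian process regression. The squared-exponential kernel, the noise level, the toy sine data, and the function names `rbf_kernel` and `gp_posterior` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_test.

    `noise` is an assumed observation-noise variance added to the
    diagonal of the training covariance.
    """
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    prior_var = np.diag(rbf_kernel(x_test, x_test))
    # Cholesky factorization for a numerically stable solve.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = prior_var - np.sum(v**2, axis=0)
    return mean, var

# Toy data (hypothetical): three training inputs near the origin.
x_train = np.array([-1.0, 0.0, 1.0])
y_train = np.sin(x_train)

# Test inputs at increasing distance from the training data.
x_test = np.array([0.0, 2.0, 5.0])
mean, var = gp_posterior(x_train, y_train, x_test)

for x, m, s2 in zip(x_test, mean, var):
    print(f"x = {x:4.1f}  mean = {m:+.3f}  variance = {s2:.3f}")
```

In this sketch the posterior variance is small at a training input and approaches the prior variance far from the data. That is the behavior the abstract relies on when the model is asked to extrapolate; in the proposed model the Gaussian process would act on the latent representations produced by the flow rather than directly on a scalar output.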