MLDec 4, 2021
Data Fusion with Latent Map Gaussian ProcessesNicholas Oune, Jonathan Tammer Eweis-Labolle, Ramin Bostanabad
Multi-fidelity modeling and calibration are data fusion tasks that ubiquitously arise in engineering design. In this paper, we introduce a novel approach based on latent-map Gaussian processes (LMGPs) that enables efficient and accurate data fusion. In our approach, we convert data fusion into a latent space learning problem where the relations among different data sources are automatically learned. This conversion endows our approach with attractive advantages such as increased accuracy, reduced costs, flexibility to jointly fuse any number of data sources, and ability to visualize correlations between data sources. This visualization allows the user to detect model form errors or determine the optimum strategy for high-fidelity emulation by fitting LMGP only to the subset of the data sources that are well-correlated. We also develop a new kernel function that enables LMGPs to not only build a probabilistic multi-fidelity surrogate but also estimate calibration parameters with high accuracy and consistency. The implementation and use of our approach are considerably simpler and less prone to numerical issues compared to existing technologies. We demonstrate the benefits of LMGP-based data fusion by comparing its performance against competing methods on a wide range of examples.
MLFeb 7, 2021
Latent Map Gaussian Processes for Mixed Variable MetamodelingNicholas Oune, Ramin Bostanabad
Gaussian processes (GPs) are ubiquitously used in sciences and engineering as metamodels. Standard GPs, however, can only handle numerical or quantitative variables. In this paper, we introduce latent map Gaussian processes (LMGPs) that inherit the attractive properties of GPs and are also applicable to mixed data which have both quantitative and qualitative inputs. The core idea behind LMGPs is to learn a continuous, low-dimensional latent space or manifold which encodes all qualitative inputs. To learn this manifold, we first assign a unique prior vector representation to each combination of qualitative inputs. We then use a low-rank linear map to project these priors on a manifold that characterizes the posterior representations. As the posteriors are quantitative, they can be directly used in any standard correlation function such as the Gaussian or Matern. Hence, the optimal map and the corresponding manifold, along with other hyperparameters of the correlation function, can be systematically learned via maximum likelihood estimation. Through a wide range of analytic and real-world examples, we demonstrate the advantages of LMGPs over state-of-the-art methods in terms of accuracy and versatility. In particular, we show that LMGPs can handle variable-length inputs, have an explainable neural network interpretation, and provide insights into how qualitative inputs affect the response or interact with each other. We also employ LMGPs in Bayesian optimization and illustrate that they can discover optimal compound compositions more efficiently than conventional methods that convert compositions to qualitative variables via manual featurization.