Benji Maruyama

MTRL-SCI
4 papers
215 citations
Novelty 31%
AI Score 39

4 Papers

LG Jan 20, 2023
Active Learning of Piecewise Gaussian Process Surrogates

Chiwoo Park, Robert Waelder, Bonggwon Kang et al.

Active learning of Gaussian process (GP) surrogates has been useful for optimizing experimental designs for physical/computer simulation experiments, and for steering data acquisition schemes in machine learning. In this paper, we develop a method for active learning of piecewise, Jump GP surrogates. Jump GPs are continuous within, but discontinuous across, regions of a design space, as required for applications spanning autonomous materials design, configuration of smart factory systems, and many others. Although our active learning heuristics are appropriated from strategies originally designed for ordinary GPs, we demonstrate that additionally accounting for model bias, as opposed to the usual model uncertainty, is essential in the Jump GP context. Toward that end, we develop an estimator for bias and variance of Jump GP models. Illustrations, and evidence of the advantage of our proposed methods, are provided on a suite of synthetic benchmarks, and real-simulation experiments of varying complexity.

65.7 CE Apr 3 Code
ARES OS 2.0: An Orchestration Software Suite for Autonomous Experimentation Systems and Self-Driving Labs

Arthur W. N. Sloan, Robert W. Waelder, Morgen L. Smith et al.

ARES OS 2.0 (hereinafter ARES OS) is an open-source software suite to enable laboratory automation and closed-loop autonomous experimentation. Its function is to orchestrate experimental actions and data handoff between lab equipment, analysis routines, and experimental planning modules through a service-oriented architecture. ARES OS is abstracted to apply to general experimental flows common in materials science, chemistry, biology, and related disciplines. The core of ARES OS provides central control over all modules, along with the heavy lifting of UI creation, data management, and experimental design tools. ARES OS modules communicate with the core software over protobuf and gRPC, allowing them to be language-agnostic and user-creatable. This allows users to easily implement modules that control experimental hardware, process collected data, or plan experiments to meet their specific research needs. ARES OS lowers the barrier to entry for researchers to build their own self-driving labs, allowing them to focus on scientific programming for their use case and reducing the effort and time needed to bring an autonomous experimentation system online.

MTRL-SCI May 23, 2021
Benchmarking the Performance of Bayesian Optimization across Multiple Experimental Materials Science Domains

Qiaohao Liang, Aldair E. Gongora, Zekun Ren et al.

In the field of machine learning (ML) for materials optimization, active learning algorithms, such as Bayesian Optimization (BO), have been leveraged for guiding autonomous and high-throughput experimentation systems. However, very few studies have evaluated the efficiency of BO as a general optimization algorithm across a broad range of experimental materials science domains. In this work, we evaluate the performance of BO algorithms with a collection of surrogate model and acquisition function pairs across five diverse experimental materials systems, namely carbon nanotube polymer blends, silver nanoparticles, lead-halide perovskites, as well as additively manufactured polymer structures and shapes. By defining acceleration and enhancement metrics for general materials optimization objectives, we find that for surrogate model selection, Gaussian Process (GP) with anisotropic kernels (automatic relevance determination, ARD) and Random Forests (RF) have comparable performance and both outperform the commonly used GP without ARD. We discuss the implicit distributional assumptions of RF and GP, and the benefits of using GP with anisotropic kernels in detail. We provide practical insights for experimentalists on surrogate model selection for BO during materials optimization campaigns.
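
As a rough illustration of what an anisotropic (ARD) kernel changes, the sketch below compares a squared-exponential kernel with a single shared length scale against one with a separate length scale per input dimension. The function name `rbf_kernel` and the length-scale values are assumptions chosen for illustration; they are not taken from the paper, and a real BO study would fit the length scales to data.

```python
import math

def rbf_kernel(x, y, length_scales):
    """Squared-exponential kernel with one length scale per input dimension.

    Passing identical length scales gives the isotropic kernel; passing
    different ones gives the anisotropic (ARD-style) kernel.
    """
    sq_dist = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales))
    return math.exp(-0.5 * sq_dist)

# Two inputs that differ only in the second dimension.
x = (0.0, 0.0)
y = (0.0, 1.0)

# Isotropic: both dimensions share the same length scale (illustrative value).
iso = rbf_kernel(x, y, (1.0, 1.0))

# Anisotropic: a very long length scale on dimension 2 says that dimension
# barely affects the response, so x and y remain strongly correlated.
ard = rbf_kernel(x, y, (1.0, 100.0))

print(f"isotropic kernel value:   {iso:.4f}")
print(f"anisotropic kernel value: {ard:.4f}")
```

With per-dimension length scales, the surrogate can effectively ignore inputs that have little influence on the objective, which is one plausible reason an anisotropic GP may do better than an isotropic one when inputs differ in relevance.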

ML Apr 2, 2019
Sequential Adaptive Design for Jump Regression Estimation

Chiwoo Park, Peihua Qiu, Jennifer Carpena-Núñez et al.

Selecting input variables or design points for statistical models has been of great interest in adaptive design and active learning. Motivated by two scientific examples, this paper presents a strategy for selecting the design points of a regression model when the underlying regression function is discontinuous. The first example concerns accelerating imaging speed in high-resolution material imaging; the second uses sequential design to map a chemical phase diagram. In both examples, the underlying regression functions have discontinuities, so many of the existing design optimization approaches cannot be applied because they mostly assume a continuous regression function. Although some existing adaptive design strategies developed from treed regression models can handle the discontinuities, the Bayesian approaches come with computationally expensive Markov Chain Monte Carlo techniques for posterior inference and subsequent design point selection, which are not appropriate for the first motivating example, where computation must be faster than the original imaging speed. In addition, the treed models are based on domain partitioning that is inefficient when the discontinuities occur over complex sub-domain boundaries. We propose a simple and effective adaptive design strategy for regression analysis with discontinuities: some statistical properties with a fixed design will be presented first, and then these properties will be used to propose a new criterion for selecting the design points for the regression analysis. Sequential design with the new criterion will be presented with comprehensive simulated examples, and its application to the two motivating examples will be presented.
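
To make the idea of sequential design for a discontinuous response concrete, here is a minimal one-dimensional sketch. It is not the criterion proposed in the paper: it uses a much simpler heuristic (sample the midpoint of the adjacent pair of design points whose responses differ most), and the test function `f`, its jump location, and the noise-free setting are all assumptions made for illustration.

```python
def f(x):
    """Hypothetical noise-free response with a jump at x = 0.37."""
    return 1.0 if x >= 0.37 else 0.0

def largest_gap_index(pts):
    """Index of the adjacent pair with the largest response difference."""
    return max(range(len(pts) - 1),
               key=lambda k: abs(pts[k + 1][1] - pts[k][1]))

def sequential_design(func, lo, hi, n_init=5, n_steps=20):
    """Start from an even grid, then add one design point per step."""
    xs = [lo + (hi - lo) * i / (n_init - 1) for i in range(n_init)]
    pts = sorted((x, func(x)) for x in xs)
    for _ in range(n_steps):
        i = largest_gap_index(pts)
        # Place the next design point between the two most dissimilar neighbours.
        x_new = 0.5 * (pts[i][0] + pts[i + 1][0])
        pts.append((x_new, func(x_new)))
        pts.sort()
    return pts

pts = sequential_design(f, 0.0, 1.0)
i = largest_gap_index(pts)
jump_estimate = 0.5 * (pts[i][0] + pts[i + 1][0])
print(f"{len(pts)} design points, estimated jump location: {jump_estimate:.6f}")
```

In this toy setting the design points concentrate around the discontinuity instead of spreading evenly over the domain, which is the behavior a continuity-assuming design criterion would not produce. Noisy responses and higher-dimensional jump boundaries, as in the paper's imaging and phase-diagram examples, need a more careful criterion than this heuristic.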