ML · LG · APP-PH · Mar 26, 2020

Gryffin: An algorithm for Bayesian optimization of categorical variables informed by expert knowledge

arXiv:2003.12127v2 · 135 citations
AI Analysis

This paper addresses the need for efficient autonomous experimentation strategies in materials science and chemistry. It offers a method that accelerates discovery and yields physical insight, though it builds incrementally on existing Bayesian optimization techniques.

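The paper tackles the optimization of categorical variables, such as catalysts or solvents, in materials science and chemistry. It introduces Gryffin, a Bayesian optimization framework that incorporates expert knowledge through physicochemical descriptors. In its simplest form, Gryffin is competitive with state-of-the-art categorical optimization algorithms; when given descriptors, it outperforms them. The paper demonstrates this on three applications: the discovery of non-fullerene acceptors for organic solar cells, the design of hybrid organic-inorganic perovskites, and the identification of ligands and process parameters for Suzuki-Miyaura reactions.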

Designing functional molecules and advanced materials requires complex design choices: tuning continuous process parameters such as temperatures or flow rates, while simultaneously selecting catalysts or solvents. To date, the development of data-driven experiment planning strategies for autonomous experimentation has largely focused on continuous process parameters despite the urge to devise efficient strategies for the selection of categorical variables. Here, we introduce Gryffin, a general purpose optimization framework for the autonomous selection of categorical variables driven by expert knowledge. Gryffin augments Bayesian optimization based on kernel density estimation with smooth approximations to categorical distributions. Leveraging domain knowledge in the form of physicochemical descriptors, Gryffin can significantly accelerate the search for promising molecules and materials. Gryffin can further highlight relevant correlations between the provided descriptors to inspire physical insights and foster scientific intuition. In addition to comprehensive benchmarks, we demonstrate the capabilities and performance of Gryffin on three examples in materials science and chemistry: (i) the discovery of non-fullerene acceptors for organic solar cells, (ii) the design of hybrid organic-inorganic perovskites for light harvesting, and (iii) the identification of ligands and process parameters for Suzuki-Miyaura reactions. Our results suggest that Gryffin, in its simplest form, is competitive with state-of-the-art categorical optimization algorithms. However, when leveraging domain knowledge provided via descriptors, Gryffin outperforms other approaches while simultaneously refining this domain knowledge to promote scientific understanding.
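To make "smooth approximations to categorical distributions" concrete, here is a minimal sketch of one such approximation. The abstract does not say which relaxation is meant, so the sketch assumes a Gumbel-softmax (Concrete) relaxation. The function name `relaxed_categorical_sample`, the solvent list, and the probabilities are hypothetical illustrations, not part of Gryffin's API or the paper's data.

```python
import math
import random


def relaxed_categorical_sample(logits, temperature, rng):
    """Draw one point on the probability simplex from a Gumbel-softmax relaxation.

    Assumption: this is one common smooth approximation to a categorical
    distribution, used here for illustration only.
    """
    noisy = []
    for logit in logits:
        u = rng.random()
        # Clamp away from 0 and 1 so both logarithms stay finite.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise obtained by inverse-transform sampling.
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    weights = [math.exp(v - top) for v in noisy]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical categorical variable: three solvent options with made-up
# selection probabilities.
options = ["toluene", "THF", "DMF"]
logits = [math.log(0.5), math.log(0.3), math.log(0.2)]
rng = random.Random(0)

smooth = relaxed_categorical_sample(logits, temperature=5.0, rng=rng)
sharp = relaxed_categorical_sample(logits, temperature=0.05, rng=rng)

print("high temperature:", [round(p, 3) for p in smooth])
print("low temperature: ", [round(p, 3) for p in sharp])
```

Each sample is a vector of non-negative weights over the options that sums to one. At a high temperature the weight is spread across several options; as the temperature approaches zero, samples concentrate on a single option and approach a one-hot choice. That continuity is what lets a method built for continuous parameters be applied to categorical ones.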

