CHEM-PHJun 28, 2020Code
Data-Driven Discovery of Molecular Photoswitches with Multioutput Gaussian ProcessesRyan-Rhys Griffiths, Jake L. Greenfield, Aditya R. Thawani et al.
Photoswitchable molecules display two or more isomeric forms that may be accessed using light. Separating the electronic absorption bands of these isomers is key to selectively addressing a specific isomer and achieving high photostationary states whilst overall red-shifting the absorption bands serves to limit material damage due to UV-exposure and increases penetration depth in photopharmacological applications. Engineering these properties into a system through synthetic design however, remains a challenge. Here, we present a data-driven discovery pipeline for molecular photoswitches underpinned by dataset curation and multitask learning with Gaussian processes. In the prediction of electronic transition wavelengths, we demonstrate that a multioutput Gaussian process (MOGP) trained using labels from four photoswitch transition wavelengths yields the strongest predictive performance relative to single-task models as well as operationally outperforming time-dependent density functional theory (TD-DFT) in terms of the wall-clock time for prediction. We validate our proposed approach experimentally by screening a library of commercially available photoswitchable molecules. Through this screen, we identified several motifs that displayed separated electronic absorption bands of their isomers, exhibited red-shifted absorptions, and are suited for information transfer and photopharmacological applications. Our curated dataset, code, as well as all models are made available at https://github.com/Ryan-Rhys/The-Photoswitch-Dataset
SOFTDec 19, 2020
Bayesian unsupervised learning reveals hidden structure in concentrated electrolytesPenelope Jones, Fabian Coupette, Andreas Härtel et al.
Electrolytes play an important role in a plethora of applications ranging from energy storage to biomaterials. Notwithstanding this, the structure of concentrated electrolytes remains enigmatic. Many theoretical approaches attempt to model the concentrated electrolytes by introducing the idea of ion pairs, with ions either being tightly `paired' with a counter-ion, or `free' to screen charge. In this study we reframe the problem into the language of computational statistics, and test the null hypothesis that all ions share the same local environment. Applying the framework to molecular dynamics simulations, we show that this null hypothesis is not supported by data. Our statistical technique suggests the presence of distinct local ionic environments; surprisingly, these differences arise in like charge correlations rather than unlike charge attraction. The resulting fraction of particles in non-aggregated environments shows a universal scaling behaviour across different background dielectric constants and ionic concentrations.
AIApr 29, 2018
Precision Medicine as an Accelerator for Next Generation Cognitive SupercomputingEdmon Begoli, Jim Brase, Bambi DeLaRosa et al.
In the past several years, we have taken advantage of a number of opportunities to advance the intersection of next generation high-performance computing AI and big data technologies through partnerships in precision medicine. Today we are in the throes of piecing together what is likely the most unique convergence of medical data and computer technologies. But more deeply, we observe that the traditional paradigm of computer simulation and prediction needs fundamental revision. This is the time for a number of reasons. We will review what the drivers are, why now, how this has been approached over the past several years, and where we are heading.