LG · QM · Jun 20, 2023

MoleCLUEs: Molecular Conformers Maximally In-Distribution for Predictive Models

arXiv:2306.11681v2
h-index: 4
Originality: Incremental advance
AI Analysis

This paper addresses unreliable predictions in molecular machine learning for drug discovery, though it is an incremental improvement on existing uncertainty estimation methods.

The paper tackles the sensitivity of structure-based molecular ML models to input geometries by generating conformers that minimize predictive uncertainty, resulting in reduced variance and improved confidence in drug property predictions.

Structure-based molecular ML (SBML) models can be highly sensitive to input geometries and give predictions with large variance. We present an approach to mitigate the challenge of selecting conformations for such models by generating conformers that explicitly minimize predictive uncertainty. To achieve this, we compute estimates of aleatoric and epistemic uncertainties that are differentiable w.r.t. latent posteriors. We then iteratively sample new latents in the direction of lower uncertainty by gradient descent. As we train our predictive models jointly with a conformer decoder, the new latent embeddings can be mapped to their corresponding inputs, which we call MoleCLUEs, or (molecular) counterfactual latent uncertainty explanations (Antorán et al., 2020). We assess our algorithm for the task of predicting drug properties from 3D structure with maximum confidence. We additionally analyze the structure trajectories obtained from conformer optimizations, which provide insight into the sources of uncertainty in SBML.
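
The core loop described in the abstract (estimate a differentiable uncertainty, then move the latent toward lower uncertainty by gradient descent) can be illustrated with a toy sketch. This is not the paper's implementation: the three-member linear ensemble `HEADS`, the softplus noise head, the latent-space distance penalty `lam`, and the finite-difference `grad` (a stand-in for automatic differentiation) are all assumptions made for illustration. The sketch also updates a single point latent deterministically rather than sampling from a latent posterior, and it omits the conformer decoder that would map the optimized latent back to a 3D structure.

```python
import math

# Hypothetical toy ensemble: each head maps a 2-D latent z to a predicted mean.
HEADS = [((1.0, 0.5), 0.0), ((0.8, -0.2), 0.1), ((1.2, 0.1), -0.1)]
# Hypothetical noise head used for the aleatoric term.
NOISE_W, NOISE_B = (0.5, -0.3), 0.0


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def epistemic(z):
    """Disagreement between ensemble members: variance of their predicted means."""
    means = [dot(w, z) + b for w, b in HEADS]
    avg = sum(means) / len(means)
    return sum((m - avg) ** 2 for m in means) / len(means)


def aleatoric(z):
    """Predicted noise variance, kept positive with a softplus."""
    return math.log1p(math.exp(dot(NOISE_W, z) + NOISE_B))


def uncertainty(z):
    return epistemic(z) + aleatoric(z)


def objective(z, z0, lam):
    # Assumed distance penalty: keeps the optimized latent near its start
    # so the toy problem stays bounded.
    return uncertainty(z) + lam * sum((a - b) ** 2 for a, b in zip(z, z0))


def grad(f, z, eps=1e-6):
    """Central finite differences; a stand-in for automatic differentiation."""
    g = []
    for i in range(len(z)):
        hi = list(z)
        lo = list(z)
        hi[i] += eps
        lo[i] -= eps
        g.append((f(hi) - f(lo)) / (2 * eps))
    return g


def minimize_uncertainty(z0, steps=50, lr=0.5, lam=0.1):
    """Gradient descent on the latent; returns the final latent and the path taken."""
    z = list(z0)
    trajectory = [list(z)]
    for _ in range(steps):
        g = grad(lambda v: objective(v, z0, lam), z)
        z = [zi - lr * gi for zi, gi in zip(z, g)]
        trajectory.append(list(z))
    return z, trajectory


z_start = [2.0, -1.5]
z_final, trajectory = minimize_uncertainty(z_start)
print("uncertainty before:", uncertainty(z_start))
print("uncertainty after: ", uncertainty(z_final))
```

The returned `trajectory` plays the role of the structure trajectories mentioned in the abstract: in a full model, each intermediate latent would be decoded into a conformer, and inspecting how the geometry changes along the path is what gives insight into the sources of uncertainty.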

