LG | DM | Sep 6, 2024

A Unified Approach to Inferring Chemical Compounds with the Desired Aqueous Solubility

arXiv:2409.04301v1 | 2 citations | h-index: 18 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient solubility prediction in drug discovery and material design, offering a simpler alternative to complex models, though it appears incremental in method.

The researchers tackled the problem of predicting and inferring chemical compounds with desired aqueous solubility, achieving prediction accuracies between 0.7191 and 0.9377 across 29 datasets and inferring optimal compounds in 6 to 1204 seconds.

Aqueous solubility (AS) is a key physicochemical property that plays a crucial role in drug discovery and material design. We report a novel unified approach to predict and infer chemical compounds with the desired AS based on simple deterministic graph-theoretic descriptors, multiple linear regression (MLR), and mixed integer linear programming (MILP). Descriptors selected by a forward stepwise procedure enabled the simplest regression model, MLR, to achieve good prediction accuracy relative to existing approaches, with accuracy in the range [0.7191, 0.9377] across 29 diverse datasets. By simulating these descriptors and learning models as MILPs, we inferred mathematically exact and optimal compounds with the desired AS, prescribed structures, and up to 50 non-hydrogen atoms in a reasonable time range of [6, 1204] seconds. These findings indicate a strong correlation between the simple graph-theoretic descriptors and the AS of compounds, potentially leading to a deeper understanding of their AS without relying on the widely used complicated chemical descriptors and complex machine learning models, which are computationally expensive and therefore difficult to use for inference. An implementation of the proposed approach is available at https://github.com/ku-dml/mol-infer/tree/master/AqSol.
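The forward stepwise selection and MLR fit mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the toy descriptor table, the use of R² as the selection score, and the stopping threshold `min_gain` are all assumptions made for illustration. The sketch uses only the Python standard library.

```python
# Illustrative sketch only: greedy forward stepwise selection for MLR.
# All names, data, and the R^2 scoring rule are assumptions, not the paper's code.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_mlr(rows, y, cols):
    """Least-squares fit of y on the chosen columns plus an intercept."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)  # [intercept, w_1, ..., w_k]

def r_squared(rows, y, cols, coef):
    """Coefficient of determination of the fitted model on the given data."""
    mean_y = sum(y) / len(y)
    pred = [coef[0] + sum(w * row[c] for w, c in zip(coef[1:], cols)) for row in rows]
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

def forward_stepwise(rows, y, max_features, min_gain=1e-6):
    """Repeatedly add the descriptor that most improves R^2."""
    selected, best_score = [], float("-inf")
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        scored = []
        for c in remaining:
            try:
                coef = fit_mlr(rows, y, selected + [c])
            except ValueError:
                continue  # candidate is collinear with already chosen columns
            scored.append((r_squared(rows, y, selected + [c], coef), c))
        if not scored:
            break
        score, c = max(scored)
        if score - best_score < min_gain:
            break  # no meaningful improvement
        selected.append(c)
        remaining.remove(c)
        best_score = score
    return selected, best_score

# Toy data (made up): 4 descriptors, target depends only on columns 0 and 2.
ROWS = [[1, 0, 2, 5], [2, 1, 0, 3], [3, 0, 1, 4],
        [4, 1, 3, 1], [5, 0, 2, 2], [6, 1, 1, 0]]
Y = [0.5 + 2 * r[0] - r[2] for r in ROWS]

if __name__ == "__main__":
    chosen, score = forward_stepwise(ROWS, Y, max_features=2)
    print(chosen, score, fit_mlr(ROWS, Y, chosen))
```

Scoring on the training data keeps the sketch short. A practical selection loop would more likely score candidates by cross-validation, and the resulting linear model is what makes a subsequent MILP formulation tractable, since each prediction is a linear function of the selected descriptors.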

