LG · OC · BM · Jul 6, 2021

An Inverse QSAR Method Based on Linear Regression and Integer Programming

arXiv:2107.02381v3 · 4 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement for computational chemistry, offering a simpler method for molecular design.

The paper tackles the inverse QSAR problem by replacing artificial neural networks with linear regression as the predictor of chemical properties. In its computational experiments, the method infers chemical graphs with up to around 50 non-hydrogen atoms.

Recently, a novel framework has been proposed for designing the molecular structure of chemical compounds using both artificial neural networks (ANNs) and mixed integer linear programming (MILP). In the framework, we first define a feature vector $f(C)$ of a chemical graph $C$ and construct an ANN that maps $x=f(C)$ to a predicted value $\eta(x)$ of a chemical property $\pi$ of $C$. After this, we formulate an MILP that simulates the computation of $f(C)$ from $C$ and that of $\eta(x)$ from $x$. Given a target value $y^*$ of the chemical property $\pi$, we infer a chemical graph $C^\dagger$ such that $\eta(f(C^\dagger))=y^*$ by solving the MILP. In this paper, we use linear regression instead of ANNs to construct the prediction function $\eta$. For this, we derive an MILP formulation that simulates the computation of a prediction function obtained by linear regression. The results of computational experiments suggest that our method can infer chemical graphs with up to around 50 non-hydrogen atoms.
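To make the inverse step concrete, the following is a minimal sketch, not the paper's formulation. It assumes the prediction function has already been fitted and has the usual linear-regression form $\eta(x) = w \cdot x + b$. The three descriptors (counts of carbon, oxygen, and nitrogen atoms), the coefficient values, and the function names are hypothetical, chosen only for illustration. A brute-force enumeration stands in for the MILP solver, and the graph-structure constraints that the paper's MILP uses to guarantee $x = f(C)$ for a valid chemical graph $C$ are omitted.

```python
from itertools import product

# Hypothetical, made-up regression coefficients for three toy descriptors:
# the numbers of carbon, oxygen, and nitrogen atoms. Not taken from the paper.
WEIGHTS = (1.5, -2.0, 0.5)
BIAS = 3.0


def eta(x):
    """Linear prediction function eta(x) = w . x + b."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def infer_feature_vectors(y_target, max_atoms, tol=1e-9):
    """Return all integer vectors x with |eta(x) - y_target| <= tol.

    Brute-force stand-in for the MILP (an assumption of this sketch): a real
    MILP would add constraints forcing x to be the feature vector of a valid
    chemical graph, and a solver would search the feasible region.
    """
    solutions = []
    for x in product(range(max_atoms + 1), repeat=len(WEIGHTS)):
        # Toy constraints: at least one carbon, bounded number of atoms.
        if x[0] < 1 or sum(x) > max_atoms:
            continue
        if abs(eta(x) - y_target) <= tol:
            solutions.append(x)
    return solutions


if __name__ == "__main__":
    for x in infer_feature_vectors(y_target=5.0, max_atoms=5):
        print(x, eta(x))
```

Because $\eta$ is linear in $x$, the condition $\eta(x) = y^*$ is itself a single linear constraint, so no auxiliary variables are needed to model the predictor. This is one plausible reading of why linear regression gives a simpler formulation than an ANN.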

