LG · Aug 9, 2024

Cycle-Configuration: A Novel Graph-theoretic Descriptor Set for Molecular Inference

arXiv:2408.05136v1 · 1 citation · h-index: 18
Originality: Incremental advance
AI Analysis

This work contributes a novel descriptor set for molecular inference, with applications in chemistry and drug discovery. The advance is incremental because it builds on the existing two-layered model framework.

The paper tackles molecular inference by introducing a novel graph-theoretic descriptor set, cycle-configuration (CC), that captures the ortho/meta/para patterns of aromatic rings, which the two-layered model framework could not represent until now. Computational experiments show that adding CC descriptors to the two-layered model yields prediction functions of similar or better performance for all 27 tested chemical properties, and that chemical graphs with up to 50 non-hydrogen vertices can be inferred in practical time.

In this paper, we propose a novel family of descriptors of chemical graphs, named cycle-configuration (CC), that can be used in the standard "two-layered (2L) model" of mol-infer, a molecular inference framework based on mixed integer linear programming (MILP) and machine learning (ML). The proposed descriptors capture the notion of ortho/meta/para patterns that appear in aromatic rings, which has been impossible in the framework so far. Computational experiments show that, when the new descriptors are supplied, we can construct prediction functions of similar or better performance for all of the 27 tested chemical properties. We also provide an MILP formulation that asks for a chemical graph with desired properties under the 2L model with CC descriptors (2L+CC model). We show that a chemical graph with up to 50 non-hydrogen vertices can be inferred in practical time.
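
To make the ortho/meta/para notion concrete, here is a minimal Python sketch that classifies pairs of substituent positions on a six-membered ring by their cyclic distance. It is only an illustration of the pattern the abstract refers to, not the paper's definition of CC descriptors. The function names `ring_pattern` and `pattern_counts` and the 0-to-5 position numbering are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

RING_SIZE = 6  # assumption: a six-membered ring such as benzene
PATTERN_BY_DISTANCE = {1: "ortho", 2: "meta", 3: "para"}


def ring_pattern(i: int, j: int) -> str:
    """Classify two distinct ring positions (0..5) by cyclic distance."""
    for p in (i, j):
        if not 0 <= p < RING_SIZE:
            raise ValueError(f"position {p} is outside 0..{RING_SIZE - 1}")
    if i == j:
        raise ValueError("positions must be distinct")
    d = abs(i - j)
    d = min(d, RING_SIZE - d)  # shortest way around the ring
    return PATTERN_BY_DISTANCE[d]


def pattern_counts(positions) -> Counter:
    """Count ortho/meta/para relations over all pairs of substituted positions."""
    return Counter(ring_pattern(i, j) for i, j in combinations(sorted(set(positions)), 2))


if __name__ == "__main__":
    # Hypothetical ring substituted at positions 0, 1 and 3.
    print(dict(pattern_counts([0, 1, 3])))
```

A descriptor that only counts ring atoms and their substituents cannot tell these three relations apart, which is the kind of gap the abstract says the CC descriptors are meant to close.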

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
