LG AI DB AP
Jan 25, 2024

Empowering Machines to Think Like Chemists: Unveiling Molecular Structure-Polarity Relationships with Hierarchical Symbolic Regression

Siyu Lou, Chengchun Liu, Yuntian Chen, Fanyang Mo

arXiv:2401.13904v14.62 citations Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the trade-off between expressiveness and interpretability in the AI models chemists use to analyze molecular polarity. It represents an incremental improvement over existing methods.

The paper tackles the limited interpretability of predictive models for thin-layer chromatography (TLC) by introducing Unsupervised Hierarchical Symbolic Regression (UHiSR), which automatically distills chemically intuitive polarity indices and discovers interpretable equations linking molecular structure to chromatographic behavior.

Thin-layer chromatography (TLC) is a crucial technique in molecular polarity analysis. Despite its importance, the interpretability of predictive models for TLC, especially those driven by artificial intelligence, remains a challenge. Current approaches, utilizing either high-dimensional molecular fingerprints or domain-knowledge-driven feature engineering, often face a dilemma between expressiveness and interpretability. To bridge this gap, we introduce Unsupervised Hierarchical Symbolic Regression (UHiSR), combining hierarchical neural networks and symbolic regression. UHiSR automatically distills chemical-intuitive polarity indices, and discovers interpretable equations that link molecular structure to chromatographic behavior.
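
To make the symbolic regression idea in the abstract concrete, here is a minimal sketch. It is not the authors' UHiSR implementation: it omits the hierarchical neural network entirely and searches only a tiny, hand-picked library of one-variable templates. The candidate set, the function names (`fit_linear`, `symbolic_regression`), and the data are all hypothetical; the data is synthetic and does not come from the paper.

```python
import math

# Hypothetical library of one-variable expression templates g(x).
# A real symbolic regression system searches a far larger expression space.
CANDIDATES = {
    "x": lambda x: x,
    "x**2": lambda x: x * x,
    "exp(-x)": lambda x: math.exp(-x),
    "1/(1+x)": lambda x: 1.0 / (1.0 + x),
}


def fit_linear(features, targets):
    """Closed-form least squares for targets ~ a * features + b."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    var_f = sum((f - mean_f) ** 2 for f in features)
    cov = sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
    a = cov / var_f
    b = mean_t - a * mean_f
    return a, b


def symbolic_regression(xs, ys):
    """Return (mse, name, a, b) for the best-fitting template y = a * g(x) + b."""
    best = None
    for name, g in CANDIDATES.items():
        feats = [g(x) for x in xs]
        a, b = fit_linear(feats, ys)
        mse = sum((a * f + b - y) ** 2 for f, y in zip(feats, ys)) / len(ys)
        if best is None or mse < best[0]:
            best = (mse, name, a, b)
    return best


# Synthetic data only (an assumption for illustration): y = 0.8 / (1 + x) + 0.1,
# where x stands in for a single made-up "polarity index".
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
ys = [0.8 / (1.0 + x) + 0.1 for x in xs]

mse, name, a, b = symbolic_regression(xs, ys)
print(f"best form: y = {a:.3f} * {name} + {b:.3f} (mse={mse:.2e})")
```

The point of the sketch is the output format: the search returns a named closed-form expression with fitted coefficients, which a chemist can read directly, rather than an opaque set of network weights.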
