LG | May 17

When Molecular Similarity Works: Property Cliffs Reveal Hidden Errors

arXiv:2605.17265 | 64.5 | Has Code
AI Analysis

For practitioners in drug discovery and materials science, this work provides a method to detect and mitigate a previously hidden failure mode in molecular property prediction models.

The paper shows that molecular property prediction models fail in local neighborhoods where structurally similar molecules have sharply different properties (property cliffs). It introduces CliffSplit, a cliff-aware evaluation protocol, and CliffLoss, a train-only mitigation mechanism. CliffSplit reveals at least 15% higher error in cliff-heavy QM9 regions, while CliffLoss reduces the cliff-to-smooth error gap by up to 30% on Lipophilicity and improves overall MAE by 9.7%.
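To make the reported quantities concrete, here is a minimal sketch of how a cliff-to-smooth error gap could be computed from per-molecule predictions. The function names, the boolean `is_cliff` labels, and the choice of a relative gap (cliff MAE divided by smooth MAE, minus one) are assumptions for illustration; the summary does not specify the paper's exact definition.

```python
def mae(y_true, y_pred):
    """Mean absolute error over paired true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def cliff_smooth_gap(y_true, y_pred, is_cliff):
    """Return (cliff MAE, smooth MAE, relative gap).

    is_cliff[i] marks whether test molecule i lies in a cliff-heavy
    neighborhood. How that label is assigned is an assumption here,
    not the paper's protocol.
    """
    cliff = [(t, p) for t, p, c in zip(y_true, y_pred, is_cliff) if c]
    smooth = [(t, p) for t, p, c in zip(y_true, y_pred, is_cliff) if not c]
    cliff_mae = mae(*zip(*cliff))
    smooth_mae = mae(*zip(*smooth))
    # Relative gap: 0.15 would mean 15% higher error on cliff molecules.
    return cliff_mae, smooth_mae, cliff_mae / smooth_mae - 1.0


# Toy values, not results from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.5, 3.25, 4.25]
is_cliff = [True, True, False, False]
cliff_mae, smooth_mae, rel_gap = cliff_smooth_gap(y_true, y_pred, is_cliff)
```

On this toy data the cliff subset has twice the error of the smooth subset, even though an aggregate MAE over all four molecules would hide the split.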

Accurate prediction of molecular properties underpins drug discovery and material design, yet even state-of-the-art models remain vulnerable to localized failure modes that aggregate metrics cannot detect. The places where molecular similarity should be most helpful are also the places where standard evaluation can be most misleading. Property cliffs expose this gap: structurally similar molecules can still differ sharply in the target property, so models with competitive overall performance may fail in high-risk local neighborhoods. To expose and mitigate this failure mode, the work introduces CliffSplit, a cliff-aware evaluation protocol that constructs locally supported, cliff-exposed test cases, and CliffLoss, a model-agnostic, train-only mitigation mechanism for cliff-sensitive errors. Experiments on three QM9 targets and three MoleculeNet tasks across five backbones show that CliffSplit reveals at least 15% higher error in cliff-heavy QM9 regions, while CliffLoss reduces the cliff-to-smooth error gap by up to 30% on Lipophilicity and improves overall MAE by 9.7%. Together, these results turn molecular similarity failure from a descriptive anomaly into a benchmarked evaluation problem for molecular machine learning. The code is available at https://anonymous.4open.science/r/Cliff_Loss.
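The notion of a property cliff described above can be illustrated with a small sketch: flag pairs of molecules whose fingerprints are similar but whose target values differ sharply. This is not the CliffSplit protocol. The set-of-bits fingerprint representation, the Tanimoto similarity measure, and both thresholds (`sim_threshold`, `gap_threshold`) are hypothetical choices made only for illustration.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def find_cliff_pairs(fingerprints, targets, sim_threshold=0.7, gap_threshold=1.0):
    """Return index pairs that are structurally similar but far apart in property.

    Both thresholds are hypothetical values, not taken from the paper.
    """
    pairs = []
    for i, j in combinations(range(len(fingerprints)), 2):
        similar = tanimoto(fingerprints[i], fingerprints[j]) >= sim_threshold
        far_apart = abs(targets[i] - targets[j]) >= gap_threshold
        if similar and far_apart:
            pairs.append((i, j))
    return pairs


# Toy data: sets of "on" fingerprint bits and one scalar property per molecule.
fps = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 4, 5}), frozenset({7, 8, 9})]
y = [0.2, 2.5, 0.3]
cliff_pairs = find_cliff_pairs(fps, y)
```

Here molecules 0 and 1 share four of five bits yet differ by 2.3 in the target, so they form the only cliff pair. The pairwise loop is quadratic in the number of molecules, so a dataset the size of QM9 would likely need a nearest-neighbor index rather than this exhaustive scan.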

