ADMET property prediction through combinations of molecular fingerprints
This work provides an incremental improvement in ADMET property prediction for drug discovery, emphasizing richer molecular representations.
The researchers evaluated a range of machine learning algorithms and molecular fingerprints for predicting small-molecule ADMET properties. Gradient-boosted decision trees trained on combined fingerprints and molecular properties performed best, a result validated across 22 benchmarks.
While investigating methods to predict small-molecule potencies, we found that random forests or support vector machines paired with extended-connectivity fingerprints (ECFP) consistently outperformed recently developed methods. A detailed investigation into regression algorithms and molecular fingerprints revealed that gradient-boosted decision trees, particularly CatBoost, were most effective when trained on a combination of ECFP, Avalon, and ErG fingerprints together with 200 molecular properties. Incorporating a graph neural network fingerprint further enhanced performance. We validated our model across 22 Therapeutics Data Commons ADMET benchmarks. Our findings underscore the significance of richer molecular representations for accurate property prediction.
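
The combined representation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes RDKit and CatBoost are installed, and the fingerprint settings (ECFP radius 2 with 2048 bits, Avalon with 1024 bits), the use of RDKit's built-in descriptor list as a stand-in for the 200 molecular properties, and the helper names `concat_blocks`, `featurize`, and `train` are all assumptions made for the example. The graph neural network fingerprint is omitted.

```python
from typing import Dict, List, Sequence, Tuple


def concat_blocks(
    blocks: Dict[str, Sequence[float]],
) -> Tuple[List[float], Dict[str, Tuple[int, int]]]:
    """Concatenate named feature blocks into one flat vector.

    Returns the vector and the (start, end) offsets of each block, so a
    feature index can later be traced back to the block it came from.
    """
    vector: List[float] = []
    offsets: Dict[str, Tuple[int, int]] = {}
    for name, values in blocks.items():
        start = len(vector)
        vector.extend(float(v) for v in values)
        offsets[name] = (start, len(vector))
    return vector, offsets


def featurize(smiles: str) -> List[float]:
    """Build one combined feature vector for a molecule (hypothetical helper)."""
    # Imported here so that concat_blocks stays usable without RDKit.
    from rdkit import Chem
    from rdkit.Avalon import pyAvalonTools
    from rdkit.Chem import AllChem, Descriptors, rdReducedGraphs

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    blocks = {
        # Assumed settings: radius 2, 2048 bits.
        "ecfp": list(AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)),
        # Assumed size: 1024 bits.
        "avalon": list(pyAvalonTools.GetAvalonFP(mol, nBits=1024)),
        # ErG returns a fixed-length array of real values.
        "erg": rdReducedGraphs.GetErGFingerprint(mol),
        # RDKit's descriptor list stands in for the molecular properties;
        # its length depends on the RDKit version.
        "props": [fn(mol) for _, fn in Descriptors.descList],
    }
    vector, _ = concat_blocks(blocks)
    return vector


def train(smiles_list: Sequence[str], targets: Sequence[float]):
    """Fit a CatBoost regressor on the combined features (default settings)."""
    from catboost import CatBoostRegressor

    features = [featurize(s) for s in smiles_list]
    model = CatBoostRegressor(verbose=0, random_seed=0)
    model.fit(features, targets)
    return model
```

Calling `train(smiles_list, targets)` on a list of SMILES strings and matching numeric labels returns a fitted model, and `model.predict([featurize(s)])` scores a new molecule. Some RDKit descriptors can return NaN or very large values for unusual molecules, so a real pipeline would likely clean or clip the property block before training.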