Uni-QSAR: an Auto-ML Tool for Molecular Property Prediction
Uni-QSAR provides a practical Auto-ML solution for drug discovery researchers, though the contribution is incremental because it builds on existing representation learning and pretraining methods.
The authors tackled the problem of limited labeled data and sensitivity to hyperparameters in deep learning-based QSAR models for molecular property prediction by developing Uni-QSAR, an Auto-ML tool that outperformed SOTA in 21 out of 22 tasks on the TDC benchmark with an average improvement of 6.09%.
Recently, deep learning (DL) based quantitative structure-activity relationship (QSAR) models have outperformed traditional methods on property prediction tasks in drug discovery. However, most DL-based QSAR models are constrained by limited labeled data, which restricts their performance, and they are also sensitive to model scale and hyper-parameters. In this paper, we propose Uni-QSAR, a powerful Auto-ML tool for molecular property prediction tasks. Uni-QSAR combines molecular representation learning (MRL) of 1D sequential tokens, 2D topology graphs, and 3D conformers with pretraining models to leverage rich representations from large-scale unlabeled data. Without any manual fine-tuning or model selection, Uni-QSAR outperforms SOTA in 21/22 tasks of the Therapeutics Data Commons (TDC) benchmark under a designed parallel workflow, with an average performance improvement of 6.09%. Furthermore, we demonstrate the practical usefulness of Uni-QSAR in drug discovery domains.
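The abstract does not say how the average performance improvement is aggregated across tasks. As a minimal sketch of one plausible reading, the Python below averages per-task relative improvements over a baseline and flips the sign for metrics where lower is better (such as MAE). The function names and the toy scores are hypothetical and are not taken from the paper or from TDC.

```python
from statistics import mean


def relative_improvement(baseline: float, new: float, higher_is_better: bool) -> float:
    """Percent change of `new` over `baseline`, signed so that positive means better."""
    if baseline == 0:
        raise ValueError("baseline score must be non-zero")
    change = (new - baseline) / abs(baseline)
    # For error-style metrics (lower is better), a decrease counts as an improvement.
    return 100.0 * (change if higher_is_better else -change)


def average_improvement(results) -> float:
    """Mean of per-task relative improvements.

    `results` is an iterable of (baseline, new, higher_is_better) tuples.
    """
    return mean(relative_improvement(b, n, h) for b, n, h in results)


# Toy values for illustration only (assumption: NOT results from the paper).
toy_results = [
    (0.80, 0.84, True),   # AUROC-like metric, higher is better
    (0.50, 0.45, False),  # MAE-like metric, lower is better
]
avg = average_improvement(toy_results)
print(f"average improvement: {avg:.2f}%")
```

Averaging relative rather than absolute changes keeps tasks with different metric scales comparable, but it is only one of several reasonable conventions; the paper may use a different one.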