A Comparative Analysis of the Ensemble Methods for Drug Design
This research addresses the limitations of QSAR modeling for drug discovery by evaluating the performance of ensemble methods, an incremental contribution to the field.
This paper investigates the effectiveness of ensemble methods in Quantitative Structure-Activity Relationship (QSAR) modeling for drug design. The authors developed and compared 57 algorithm configurations, pairing various ensemble algorithms with basic algorithms, across 4 datasets to determine whether ensembles consistently outperform individual algorithms.
Quantitative structure-activity relationship (QSAR) modeling is a computational technique for identifying relationships between the structural properties of chemical compounds and their biological activity. QSAR modeling is necessary for drug discovery, but it has many limitations. Ensemble-based machine learning approaches have been used to overcome these limitations and generate reliable predictions. Ensemble learning creates a set of diverse models and combines them. In our comparative analysis, each ensemble algorithm was paired with each of the basic algorithms, and the basic algorithms were also investigated separately. In this configuration, 57 algorithms were developed and compared on 4 different datasets. Thus, a technique for a complex ensemble method is proposed that builds diversified models and integrates them. Taken individually, the proposed models did not show impressive results, but they were considered the most important predictors when combined. We assessed whether ensembles always give better results than individual algorithms. The Python code written to obtain the experimental results in this article has been uploaded to GitHub (https://github.com/rifqat/Comparative-Analysis).
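The pairing design described in the abstract (every ensemble algorithm wrapped around every basic algorithm, plus each basic algorithm on its own) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the authors' repository: the two toy base learners (`fit_mean`, `fit_1nn`), the two resampling ensembles (`bagging`, `pasting`), the synthetic one-dimensional dataset, and all helper names. It uses only the Python standard library and shows how such a comparison grid might be built and scored, not how the paper's 57 configurations were actually implemented.

```python
import math
import random
from itertools import product
from statistics import mean

# Toy base learners (stand-ins, NOT the paper's algorithms).
# Each takes training data and returns a prediction function.
def fit_mean(X, y):
    """Predict the training-set mean for every input."""
    m = mean(y)
    return lambda x: m

def fit_1nn(X, y):
    """Predict the target of the nearest training point."""
    def predict(x):
        dist = lambda j: sum((a - b) ** 2 for a, b in zip(X[j], x))
        return y[min(range(len(X)), key=dist)]
    return predict

# Toy ensemble wrappers: take a base fit function, return a new fit function
# that trains several base models on resampled data and averages them.
def resampling_ensemble(draw_indices):
    def wrap(fit_base, n_models=10, seed=0):
        def fit(X, y):
            rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
            models = []
            for _ in range(n_models):
                idx = draw_indices(rng, len(X))
                models.append(fit_base([X[i] for i in idx], [y[i] for i in idx]))
            return lambda x: mean(m(x) for m in models)
        return fit
    return wrap

# Bagging: bootstrap samples drawn with replacement.
bagging = resampling_ensemble(lambda rng, n: [rng.randrange(n) for _ in range(n)])
# Pasting: half-size samples drawn without replacement.
pasting = resampling_ensemble(lambda rng, n: rng.sample(range(n), max(1, n // 2)))

BASE = {"mean": fit_mean, "1nn": fit_1nn}
ENSEMBLES = {"bagging": bagging, "pasting": pasting}

def build_configurations():
    """Base algorithms alone, plus every (ensemble, base) pairing."""
    configs = dict(BASE)
    for (e_name, wrap), (b_name, fit) in product(ENSEMBLES.items(), BASE.items()):
        configs[f"{e_name}+{b_name}"] = wrap(fit)
    return configs

def rmse(predict, X, y):
    """Root-mean-square error of a prediction function on (X, y)."""
    return math.sqrt(mean((predict(x) - t) ** 2 for x, t in zip(X, y)))

# Synthetic data (hypothetical): y = 2x + 1, even indices train, odd indices test.
X = [[float(i)] for i in range(20)]
y = [2.0 * x[0] + 1.0 for x in X]
X_train, y_train = X[::2], y[::2]
X_test, y_test = X[1::2], y[1::2]

configs = build_configurations()
scores = {name: rmse(fit(X_train, y_train), X_test, y_test)
          for name, fit in configs.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:>14}  RMSE = {score:.3f}")
```

With a full grid of this kind, the number of configurations is the number of base algorithms plus the number of (ensemble, base) pairs. Ranking all of them by the same held-out error makes it easy to see whether any ensemble actually beats its own base learner, which is the question the study poses.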