Predicting the activity of chemical compounds based on machine learning approaches
This work addresses a cheminformatics problem for researchers, but it is incremental as it relies on existing methods without introducing new paradigms.
The study tackles the prediction of chemical compound activity by testing 100 combinations of existing machine learning techniques, selecting among them using the G-means, F1-score, and AUC metrics, and testing the results on a dataset of about 10,000 PubChem compounds classified by activity.
Exploring machine learning (ML) methods and techniques to address specific challenges in various fields is essential. In this work, we tackle a problem in the domain of cheminformatics: predicting the activity of a chemical compound as accurately as possible. To this end, the study conducts experiments on 100 different combinations of existing techniques. The resulting solutions are then selected according to a set of criteria comprising the G-means, F1-score, and AUC metrics. The results have been tested on a dataset of about 10,000 chemical compounds from PubChem that have been classified according to their activity.
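
To make the three selection criteria concrete, the following is a minimal sketch of how they could be computed for a binary active/inactive classifier. It assumes that "G-means" denotes the usual geometric mean of sensitivity and specificity, and that AUC is the area under the ROC curve, computed here through its pairwise-ranking interpretation. The function names, the toy labels and scores, and the 0.5 threshold are illustrative assumptions and are not taken from the study.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = active, 0 = inactive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed meaning of G-means)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sqrt(sensitivity * specificity)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the active class."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive score count as half a correct pair.
    This is quadratic in the number of samples, which is fine for a sketch.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 3 active and 5 inactive compounds.
    y_true = [1, 0, 0, 0, 1, 0, 1, 0]
    scores = [0.9, 0.2, 0.6, 0.1, 0.4, 0.3, 0.8, 0.7]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # illustrative threshold

    print(f"G-mean: {g_mean(y_true, y_pred):.3f}")
    print(f"F1:     {f1_score(y_true, y_pred):.3f}")
    print(f"AUC:    {auc(y_true, scores):.3f}")
```

G-mean and F1 depend on the thresholded predictions, whereas AUC depends only on how the scores rank the compounds. That difference is one plausible reason to use the three metrics together, particularly when active compounds are much rarer than inactive ones.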