KANEL: Kolmogorov-Arnold Network Ensemble Learning Enables Early Hit Enrichment in High-Throughput Virtual Screening

Pavel Koptev, Nikita Krainov, Konstantin Malkov, Alexander Tropsha

arXiv:2603.25755
34.8
h-index: 11

AI Analysis

For computational drug discovery, KANEL provides a practical ensemble method that improves early hit enrichment, a key metric for prioritizing compounds in high-throughput screening.

KANEL combines Kolmogorov-Arnold Networks with traditional ML models on complementary molecular representations to improve early hit enrichment (PPV@N) in virtual screening, outperforming individual models and ensembles.

Machine learning models of chemical bioactivity are increasingly used to prioritize a small number of compounds from virtual screening libraries for experimental follow-up. In these applications, assessing model accuracy by early hit enrichment, for example the positive predictive value (PPV) calculated over the top N ranked compounds (PPV@N), is more appropriate and actionable than traditional global metrics such as AUC. We present KANEL, an ensemble workflow that combines interpretable Kolmogorov-Arnold Networks (KANs) with XGBoost, random forest, and multilayer perceptron models trained on complementary molecular representations (LillyMol descriptors, RDKit-derived descriptors, and Morgan fingerprints).
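To make the PPV@N metric concrete, here is a minimal sketch of how it could be computed from binary activity labels and model scores. The function name `ppv_at_n`, the toy labels, and the toy scores are illustrative assumptions and are not taken from the KANEL paper or its code.

```python
import numpy as np

def ppv_at_n(y_true, scores, n):
    """Fraction of true actives among the n highest-scoring compounds.

    Hypothetical helper for illustration; not the authors' implementation.
    y_true: 1 for active, 0 for inactive. scores: higher means more likely active.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    if y_true.ndim != 1 or y_true.shape != scores.shape:
        raise ValueError("y_true and scores must be 1-D arrays of equal length")
    if not 1 <= n <= len(scores):
        raise ValueError("n must be between 1 and the number of compounds")
    # Sort by descending score; the stable sort keeps input order among ties.
    top = np.argsort(-scores, kind="stable")[:n]
    return float(y_true[top].sum() / n)

# Toy data (assumed for illustration): 8 compounds, 3 of them active.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

print(ppv_at_n(labels, scores, 4))  # 2 actives in the top 4 -> 0.5
```

Unlike AUC, which summarizes the ranking over the whole library, this value depends only on the top n positions, which is where compounds are actually selected for experimental follow-up.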
