LG · AI · Mar 31, 2025

Evaluating machine learning models for predicting pesticides toxicity to honey bees

arXiv:2503.24305v3 · 6 citations · h-index: 5 · Ecotoxicol Environ Saf
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of predicting chemical toxicity to honey bees, an ecologically vital pollinator, but it is incremental: it primarily benchmarks existing methods on a new dataset.

The study evaluated machine learning models for predicting pesticide toxicity to honey bees using the ApisTox dataset, finding that current state-of-the-art algorithms trained on biomedical data perform poorly on this agrochemical dataset, highlighting a need for domain-specific development.

Small molecules play a critical role in the biomedical, environmental, and agrochemical domains, each with distinct physicochemical requirements and success criteria. Although biomedical research benefits from extensive datasets and established benchmarks, agrochemical data remain scarce, particularly with respect to species-specific toxicity. This work focuses on ApisTox, the most comprehensive dataset of experimentally validated chemical toxicity to the honey bee (Apis mellifera), an ecologically vital pollinator. We evaluate ApisTox using a diverse suite of machine learning approaches, including molecular fingerprints, graph kernels, and graph neural networks, as well as pretrained models. Comparative analysis with medicinal datasets from the MoleculeNet benchmark reveals that ApisTox represents a distinct chemical space. Performance degradation on non-medicinal datasets, such as ApisTox, demonstrates the limited generalizability of current state-of-the-art algorithms trained solely on biomedical data. Our study highlights the need for more diverse datasets and for targeted model development geared toward the agrochemical domain.

Code Implementations: 1 repo