AI4AI: Quantitative Methods for Classifying Host Species from Avian Influenza DNA Sequence
This work addresses the need for faster, more resource-efficient methods during Avian Influenza outbreaks, particularly in Asian countries, though it appears incremental because it applies existing ML techniques to a specific domain.
The study tackled the problem of classifying host species from Avian Influenza DNA sequences using machine learning and deep learning, achieving top-1 accuracy of 47% and top-3 accuracy of 82% on a dataset of 11 species.
Avian Influenza outbreaks cause millions of dollars in damage each year globally, especially in Asian countries such as China and South Korea. The magnitude of an outbreak's impact correlates directly with the time required to fully understand the influenza virus, particularly its interspecies pathogenicity. Gaining this understanding requires laboratory tests, which demand resources that are typically scarce during an outbreak emergency. In this study, we propose new quantitative methods that use machine learning and deep learning to classify host species from raw DNA sequence data of the influenza virus and to provide a probability for each classification. The best deep learning models achieve a top-1 classification accuracy of 47% and a top-3 classification accuracy of 82% on a dataset of 11 host species classes.
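To make the reported metrics concrete, the following is a minimal sketch of how top-1 and top-3 accuracy can be computed from per-class probabilities. It is not the study's evaluation code: the function name `top_k_accuracy`, the four-class toy data, and the probability values are all illustrative assumptions.

```python
from typing import Sequence


def top_k_accuracy(
    probs: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true class is among the k most probable classes.

    `probs[i][c]` is the predicted probability of class c for sample i
    (hypothetical layout, assumed for this sketch).
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(probs, labels):
        # Rank class indices from most to least probable.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(probs)


# Made-up probabilities for three sequences over four hypothetical host classes.
example_probs = [
    [0.60, 0.20, 0.15, 0.05],  # true class 0 is ranked first
    [0.10, 0.50, 0.30, 0.10],  # true class 2 is ranked second
    [0.40, 0.30, 0.20, 0.10],  # true class 3 is ranked last
]
example_labels = [0, 2, 3]

top1 = top_k_accuracy(example_probs, example_labels, k=1)
top3 = top_k_accuracy(example_probs, example_labels, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

For context, uniform random guessing over 11 classes would yield about 9% top-1 and 27% top-3 accuracy. That baseline assumes equal treatment of all classes and says nothing about the class balance of the actual dataset, which is not stated here.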