LG · ME · Apr 17, 2025

Predicting BVD Re-emergence in Irish Cattle From Highly Imbalanced Herd-Level Data Using Machine Learning Algorithms

arXiv:2504.13116v1 · h-index: 4
Originality: Synthesis-oriented
AI Analysis

This paper addresses targeted surveillance for disease re-emergence in agriculture, with Irish cattle farmers and animal health officials as the intended audience. It is incremental in that it applies existing methods to a new dataset, with practical improvements.

The study predicts Bovine Viral Diarrhoea (BVD) re-emergence in Irish cattle using machine learning on highly imbalanced herd-level data. Random forests achieved the highest sensitivity, correctly identifying 219 of 250 positive herds while halving the number of herds needing testing compared with blanket testing.
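To make the headline figure concrete, the short sketch below works out the sensitivity implied by 219 of 250 positive herds being identified. The function name `sensitivity` is illustrative only and is not taken from the paper.

```python
def sensitivity(true_positives: int, actual_positives: int) -> float:
    """Fraction of truly positive herds that the model flags (also called recall)."""
    if actual_positives <= 0:
        raise ValueError("actual_positives must be positive")
    return true_positives / actual_positives


# Figures reported in the summary: 219 of 250 positive herds identified.
flagged, positives = 219, 250
sens = sensitivity(flagged, positives)
missed = positives - flagged

print(f"sensitivity = {sens:.3f}, missed herds = {missed}")
# prints: sensitivity = 0.876, missed herds = 31
```

In other words, roughly 88% of positive herds were caught, and 31 would have been missed under the targeted strategy.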

Bovine Viral Diarrhoea (BVD) has been the focus of a successful eradication programme in Ireland, with the herd-level prevalence declining from 11.3% in 2013 to just 0.2% in 2023. As the country moves toward BVD freedom, the development of predictive models for targeted surveillance becomes increasingly important to mitigate the risk of disease re-emergence. In this study, we evaluate the performance of a range of machine learning algorithms, including binary classification and anomaly detection techniques, for predicting BVD-positive herds using highly imbalanced herd-level data. We conduct an extensive simulation study to assess model performance across varying sample sizes and class imbalance ratios, incorporating resampling, class weighting, and appropriate evaluation metrics (sensitivity, positive predictive value, F1-score and AUC values). Random forests and XGBoost models consistently outperformed other methods, with the random forest model achieving the highest sensitivity and AUC across scenarios, including real-world prediction of 2023 herd status, correctly identifying 219 of 250 positive herds while halving the number of herds that require testing compared to a blanket-testing strategy.
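The abstract mentions class weighting and metrics such as positive predictive value (PPV) and F1-score but does not spell out the exact schemes used. As a minimal sketch, assuming the common inverse-frequency heuristic (the same formula scikit-learn applies for `class_weight="balanced"`), the example below shows how heavily a rare positive class is up-weighted at 0.2% prevalence and how PPV and F1 are computed. The labels and predictions are hypothetical toy data, not results from the paper, and the function names are illustrative.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * k) for cls, k in counts.items()}


def ppv_and_f1(y_true, y_pred):
    """Positive predictive value and F1-score for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return ppv, f1


# Hypothetical toy data: 2 positive herds among 1000 (0.2% prevalence).
y_true = [1] * 2 + [0] * 998
weights = balanced_class_weights(y_true)

# Hypothetical predictions: both positives flagged, plus 6 false alarms.
y_pred = [1] * 2 + [1] * 6 + [0] * 992
ppv, f1 = ppv_and_f1(y_true, y_pred)

print(weights[1], round(weights[0], 3))  # 250.0 0.501
print(ppv, round(f1, 3))                 # 0.25 0.4
```

The toy numbers illustrate why sensitivity alone is not enough on imbalanced data: every positive herd is found, yet only one in four flagged herds is truly positive.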
