Peer-reviewed veterinary case report
Comparative performance of bagging and boosting ensemble models for predicting lumpy skin disease with multiclass-imbalanced data.
- Journal: Scientific Reports
- Year: 2025
- Authors: Gouda, Hagar F & Abdallah, Fatma D M
- Affiliation: Animal Wealth Development Department (Biostatistics subdivision)
Abstract
Ensemble machine learning (ML) algorithms, such as bagging and boosting, are powerful decision-support tools that enhance disease prediction and risk management in the veterinary field. Lumpy Skin Disease (LSD) poses a significant threat to livestock health and results in substantial economic losses. This study aims to predict LSD using 1,041 data records collected from six Egyptian governorates between June 2020 and October 2022. The dataset exhibits a multiclass imbalance with three outcome classes: Dead (6%), Diseased (32%), and Healthy (62%). To address this imbalance, we applied SMOTE, Random Oversampling (ROS), and Random Undersampling (RUS). Five ensemble models: Decision Tree (DT), Random Forest (RF), AdaBoost, Gradient Boosting (GBoost), and XGBoost were evaluated on both imbalanced and balanced datasets, with hyperparameter tuning via grid search and 10-fold cross-validation. Our findings highlight the superior performance of the RF model combined with ROS (RF-ROS), achieving the highest accuracy (82%) and AUC (0.93), followed by balanced XGBoost (81.25%, AUC = 0.93). AdaBoost and GBoost also improved significantly after oversampling and tuning. SHAP analysis identified vaccination status as the most important predictor, emphasizing targeted interventions. These results demonstrate that combining resampling with hyperparameter tuning enhances ML performance on imbalanced veterinary data.
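Random Oversampling (ROS), the resampling method behind the best-performing RF-ROS model, can be illustrated with a minimal sketch. The abstract does not say which software the authors used, so this uses only the Python standard library. The function name `random_oversample`, the per-class counts (chosen to approximate the reported 6% / 32% / 62% split of 1,041 records), and the placeholder feature rows are all assumptions made for illustration, not the study's data or code.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is topped up to the size of the largest class.
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement from the class's own rows.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Assumed counts approximating the reported class shares of 1,041 records.
counts = {"Dead": 62, "Diseased": 333, "Healthy": 646}
labels = [name for name, n in counts.items() for _ in range(n)]
rows = [[i] for i in range(len(labels))]  # placeholder one-feature rows

balanced_rows, balanced_labels = random_oversample(rows, labels)
print("before:", Counter(labels))
print("after: ", Counter(balanced_labels))
```

When resampling is combined with cross-validation, it is generally applied to the training folds only, so that duplicated rows do not leak into the held-out fold. The abstract does not state how the authors handled this step.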
Original publication: https://pubmed.ncbi.nlm.nih.gov/41214187/