LG · AI · Jun 11, 2025

Enhancing Bagging Ensemble Regression with Data Integration for Time Series-Based Diabetes Prediction

arXiv:2506.13786v1 · 2 citations · h-index: 17 · ICCCI
Originality: Incremental advance
AI Analysis

This work addresses the need for accurate state-level diabetes predictions to aid healthcare planning. It is an incremental advance: it builds on existing bagging ensemble methods and combines them with data integration.

This study tackled the problem of predicting diabetes prevalence across U.S. cities by integrating diabetes-related datasets from 2011 to 2021 and introducing an enhanced bagging ensemble regression model (EBMBag+), achieving an MAE of 0.41, RMSE of 0.53, MAPE of 4.01, and R² of 0.9.
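For readers less familiar with the four reported metrics, the sketch below shows how MAE, RMSE, MAPE, and R² are conventionally computed. It is not taken from the paper: the function name `regression_metrics` and the sample values are hypothetical, and expressing MAPE as a percentage is an assumption, since the summary does not state the unit.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent, an assumption) and R^2.

    Hypothetical helper for illustration; y_true must contain no zeros
    (MAPE divides by it) and must not be constant (R^2 divides by its variance).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Made-up prevalence values (percent of population), for illustration only.
observed = [10.0, 12.0, 8.0, 10.0]
predicted = [9.0, 12.0, 9.0, 10.0]
print(regression_metrics(observed, predicted))
```

MAE and RMSE are in the units of the target, MAPE is relative to the observed value, and R² measures the share of variance explained, so the four numbers are not directly comparable with one another.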

Diabetes is a chronic metabolic disease characterized by elevated blood glucose levels, leading to complications like heart disease, kidney failure, and nerve damage. Accurate state-level predictions are vital for effective healthcare planning and targeted interventions, but in many cases, data for necessary analyses are incomplete. This study begins with a data engineering process to integrate diabetes-related datasets from 2011 to 2021 to create a comprehensive feature set. We then introduce an enhanced bagging ensemble regression model (EBMBag+) for time series forecasting to predict diabetes prevalence across U.S. cities. Several baseline models, including SVMReg, BDTree, LSBoost, NN, LSTM, and ERMBag, were evaluated for comparison with our EBMBag+ algorithm. The experimental results demonstrate that EBMBag+ achieved the best performance, with an MAE of 0.41, RMSE of 0.53, MAPE of 4.01, and an R² of 0.9.
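As background on the general technique the abstract builds on, the following is a minimal sketch of plain bagging regression applied to a lagged time series. It is not EBMBag+; the abstract does not describe the paper's enhancements, so none are reproduced here. The base learner (a one-feature least-squares line), the function names, and the series values are all assumptions chosen for illustration.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept on one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (all x identical): fall back to the mean of y.
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Fit one base learner per bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Average the base learners' predictions, the aggregation step of bagging."""
    return sum(slope * x + intercept for slope, intercept in models) / len(models)


# Made-up yearly prevalence series; the lag-1 value is the only feature.
series = [8.0, 8.3, 8.6, 8.9, 9.2, 9.5, 9.8]
lagged, target = series[:-1], series[1:]
models = fit_bagging(lagged, target)
print(predict_bagging(models, series[-1]))  # one-step-ahead forecast
```

Averaging over bootstrap resamples mainly reduces the variance of unstable base learners, which is why bagging is more often paired with decision trees than with the linear fit used here for brevity.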
