SPCVIVOct 26, 2025

Benchmarking ResNet for Short-Term Hypoglycemia Classification with DiaData

arXiv:2511.02849v15 citationsh-index: 4IEEE journal of biomedical and health informatics
Originality Synthesis-oriented
AI Analysis

This provides incremental improvements for Type 1 Diabetes management by enhancing data quality and benchmarking classification.

This study improved data quality in the DiaData dataset for Type 1 Diabetes by cleaning outliers and imputing missing values, then benchmarked a ResNet model for hypoglycemia classification, achieving a 7% performance improvement with more data and 2-3% gain with refined data.

Individualized therapy is driven forward by medical data analysis, which provides insight into the patient's context. In particular, for Type 1 Diabetes (T1D), which is an autoimmune disease, relationships between demographics, sensor data, and context can be analyzed. However, outliers, noisy data, and small data volumes cannot provide a reliable analysis. Hence, the research domain requires large volumes of high-quality data. Moreover, missing values can lead to information loss. To address this limitation, this study improves the data quality of DiaData, an integration of 15 separate datasets containing glucose values from 2510 subjects with T1D. Notably, we make the following contributions: 1) Outliers are identified with the interquartile range (IQR) approach and treated by replacing them with missing values. 2) Small gaps ($\le$ 25 min) are imputed with linear interpolation and larger gaps ($\ge$ 30 and $<$ 120 min) with Stineman interpolation. Based on a visual comparison, Stineman interpolation provides more realistic glucose estimates than linear interpolation for larger gaps. 3) After data cleaning, the correlation between glucose and heart rate is analyzed, yielding a moderate relation between 15 and 60 minutes before hypoglycemia ($\le$ 70 mg/dL). 4) Finally, a benchmark for hypoglycemia classification is provided with a state-of-the-art ResNet model. The model is trained with the Maindatabase and Subdatabase II of DiaData to classify hypoglycemia onset up to 2 hours in advance. Training with more data improves performance by 7% while using quality-refined data yields a 2-3% gain compared to raw data.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes