Presenting DiaData for Research on Type 1 Diabetes
This work provides a new large-scale dataset for researchers in diabetes care, enabling better machine learning models for predicting hypoglycemia. The contribution is incremental, since it focuses on data integration rather than novel methods.
This work tackles the scarcity of large datasets for type 1 diabetes research by systematically integrating 15 datasets into a database of 2510 subjects with 149 million glucose measurements, 4% of which fall in the hypoglycemic range. It also conducts a correlation study showing a relation between glucose levels and heart rate data 15 to 55 minutes before hypoglycemia.
Type 1 diabetes (T1D) is an autoimmune disorder that leads to the destruction of insulin-producing cells, resulting in insulin deficiency, which is why affected individuals depend on external insulin injections. However, insulin lowers blood glucose levels and can cause hypoglycemia. Hypoglycemia is a severe event of low blood glucose levels ($\le$70 mg/dL) with dangerous consequences such as dizziness, coma, or death. Data analysis can significantly enhance diabetes care by identifying personal patterns and trends leading to adverse events. In particular, machine learning (ML) models can predict glucose levels and provide early alarms. However, diabetes and hypoglycemia research is limited by the unavailability of large datasets. Thus, this work systematically integrates 15 datasets to provide a large database of 2510 subjects with glucose measurements recorded every 5 minutes. In total, 149 million measurements are included, of which 4% represent values in the hypoglycemic range. Moreover, two sub-databases are extracted: sub-database I includes demographics, and sub-database II includes heart rate data. The integrated dataset provides a balanced sex distribution and covers different age levels. As a further contribution, data quality is assessed, revealing that data imbalance and missing values present a significant challenge. Finally, a correlation study on glucose levels and heart rate data is conducted, showing a relation between the two in the window from 15 to 55 minutes before hypoglycemia.
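As a rough illustration of the two quantities mentioned above (the share of readings in the hypoglycemic range and a time-lagged relation between heart rate and glucose), the following minimal sketch may help. It is not the authors' pipeline: the function names, the use of Pearson correlation, the convention of pairing heart rate at time t with glucose at time t + lag, and the toy series are all assumptions made for illustration. Only the $\le$70 mg/dL threshold and the 5-minute sampling interval come from the text.

```python
from math import sqrt

HYPO_THRESHOLD_MG_DL = 70  # threshold stated in the abstract (<= 70 mg/dL)
SAMPLE_MINUTES = 5         # sampling interval stated in the abstract


def hypo_fraction(glucose):
    """Share of non-missing readings that fall in the hypoglycemic range."""
    valid = [g for g in glucose if g is not None]
    if not valid:
        return 0.0
    return sum(g <= HYPO_THRESHOLD_MG_DL for g in valid) / len(valid)


def pearson(xs, ys):
    """Pearson correlation; returns None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)


def lagged_correlation(heart_rate, glucose, lag_minutes):
    """Correlate heart rate at time t with glucose at time t + lag.

    Assumed convention (hypothetical): both series share the same
    5-minute grid, and missing samples are encoded as None and skipped.
    """
    steps = lag_minutes // SAMPLE_MINUTES
    pairs = [
        (hr, g)
        for hr, g in zip(heart_rate, glucose[steps:])
        if hr is not None and g is not None
    ]
    if not pairs:
        return None
    hrs, gs = zip(*pairs)
    return pearson(hrs, gs)


# Toy, made-up series (NOT taken from DiaData), one value per 5 minutes.
glucose = [110, 104, 97, 90, None, 76, 69, 65, 72, 85]
heart_rate = [62, 64, 67, 71, 76, 80, 79, 75, 70, 66]

print(f"hypoglycemic share: {hypo_fraction(glucose):.2f}")
for lag in (15, 30):
    print(f"lag {lag} min:", lagged_correlation(heart_rate, glucose, lag))
```

On a real recording, the lag would be swept over the full 15 to 55 minute window, and missing values would need more careful handling than simply skipping them, given that the data quality assessment flags missing data as a significant challenge.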