QM LG, May 14, 2021

Quantified Sleep: Machine learning techniques for observational n-of-1 studies

arXiv:2105.06811v1 (has code)

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of handling complex, noisy data in personalized health tracking for individuals, though it is incremental as it applies existing methods to a specific domain.

The paper tackled the challenge of modeling sleep quality in an n-of-1 Quantified-Self study by applying statistical learning techniques to 472 days of personal data, resulting in a descriptive model that identified 16 key predictive features to narrow down factors for future interventional studies.

This paper applies statistical learning techniques to an observational Quantified-Self (QS) study to build a descriptive model of sleep quality. A total of 472 days of my sleep data was collected with an Oura ring and combined with lifestyle, environmental, and psychological data. Such n-of-1 QS projects pose a number of challenges: heterogeneous data sources, missing values, high dimensionality, dynamic feedback loops, and human biases. This paper directly addresses these challenges with an end-to-end QS pipeline that produces robust descriptive models. Sleep quality is one of the most difficult modelling targets in QS research, due to high noise and a large number of weakly-contributing factors. Sleep quality was selected so that approaches from this paper would generalise to most other n-of-1 QS projects. Techniques are presented for combining and engineering features for the different classes of data types, sample frequencies, and schemas, including event logs, weather, and geo-spatial data. Statistical analyses for outliers, normality, (auto)correlation, stationarity, and missing data are detailed, along with a proposed method for hierarchical clustering to identify correlated groups of features. Missing data was handled using a combination of knowledge-based and statistical techniques, including several multivariate imputation algorithms. "Markov unfolding" is presented for collapsing the time series into a collection of independent observations, whilst incorporating historical information. The final model was interpreted in two ways: by inspecting the internal $\beta$-parameters, and by using the SHAP framework. These two interpretation techniques were combined to produce a list of the 16 most-predictive features, demonstrating that an observational study can greatly narrow down the number of features that need to be considered when designing interventional QS studies.
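The abstract does not spell out how "Markov unfolding" works. One plausible reading, offered here only as an assumption, is that each day's row is augmented with lagged copies of the features, so that every row carries its own recent history and can be treated as a standalone observation. The minimal pandas sketch below illustrates that reading; the function name `markov_unfold`, the column names, and the toy values are hypothetical and not taken from the paper or its code.

```python
import pandas as pd


def markov_unfold(df: pd.DataFrame, lags: int = 2) -> pd.DataFrame:
    """Append lagged copies of every column so each row carries its own history.

    Assumed interpretation of "Markov unfolding"; the paper's exact
    procedure may differ.
    """
    parts = [df]
    for k in range(1, lags + 1):
        # shift(k) moves values k rows down, so row t receives the values from t-k.
        parts.append(df.shift(k).add_suffix(f"_lag{k}"))
    # The first `lags` rows lack a full history (NaN in lagged columns); drop them.
    return pd.concat(parts, axis=1).dropna()


# Toy daily data (hypothetical values, not from the study).
days = pd.DataFrame(
    {"sleep_score": [70, 75, 80, 78, 82], "caffeine_mg": [100, 0, 50, 200, 0]},
    index=pd.date_range("2021-01-01", periods=5, freq="D"),
)
unfolded = markov_unfold(days, lags=2)
print(unfolded)
```

With two lags, each of the two original columns gains two lagged copies, and the first two days are dropped because they have no complete history. The number of lags trades historical context against dimensionality, which matters when only a few hundred days of data are available.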
