Clustering of Pain Dynamics in Sickle Cell Disease from Sparse, Uneven Samples
This work addresses the analysis of sparse, unevenly sampled pain data from patients with sickle cell disease, with the aim of informing both physicians and patients. The contribution is incremental: it adapts existing clustering methods to a specific data challenge.
The study addresses the clustering of irregularly sampled time series, such as pain dynamics in sickle cell disease, by proposing and assessing four data alignment methods that allow spectral clustering to be applied. It finds that three clusters are a reasonable description of patient pain patterns.
Irregularly sampled time series data are common in a variety of fields, and many typical methods for drawing insight from data fail in this case. Here we attempt to generalize methods for clustering trajectories to irregularly and sparsely sampled data. We first construct synthetic data sets, then propose and assess four methods of data alignment to allow for application of spectral clustering. We then repeat the same process for real data drawn from medical records of patients with sickle cell disease, whose subjective experiences of pain were tracked for several months via a mobile app. We find that different methods for aligning irregularly sampled sparse data sets can lead to different optimal numbers of clusters, even for synthetic data with known properties. For the case of sickle cell disease, we find that three clusters is a reasonable choice, and these appear to correspond to (1) a low pain group with occasional acute pain, (2) a group that experiences moderate mean pain that fluctuates often from low to high, and (3) a group that experiences persistent high levels of pain. Our results may help physicians and patients better understand and manage patients' pain levels over time, and we expect that the methods we develop will apply to a wide range of other data sources in medicine and beyond.
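The align-then-cluster pipeline described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the alignment step is plain linear interpolation onto a shared grid (not necessarily one of the four alignment methods the authors assess), the spectral clustering is a generic normalized-Laplacian variant written with NumPy only, and the synthetic patients, the 90-day window, the 0 to 10 pain scale, and the function names `align_to_grid`, `spectral_labels`, and `simulate_patient` are all hypothetical.

```python
import numpy as np

def align_to_grid(times, values, grid):
    """Linearly interpolate one irregularly sampled series onto a shared grid.

    np.interp holds the first/last observed value constant outside the
    observed time range, which is itself a modelling assumption.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(times)
    return np.interp(grid, times[order], values[order])

def spectral_labels(X, k, seed=0, n_iter=100):
    """Cluster the rows of X with a generic normalized spectral clustering."""
    n = X.shape[0]
    # Gaussian affinity; squared bandwidth = median squared pairwise distance.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    sigma2 = np.median(d2[d2 > 0])
    W = np.exp(-d2 / (2.0 * sigma2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    inv_sqrt_deg = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(n) - inv_sqrt_deg[:, None] * W * inv_sqrt_deg[None, :]
    # Eigenvectors of the k smallest eigenvalues, rows scaled to unit length.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    # Plain k-means on the embedding, with farthest-point initialisation.
    rng = np.random.default_rng(seed)
    centers = [U[rng.integers(n)]]
    while len(centers) < k:
        dist = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq.argmin(axis=1)
        updated = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels

# Hypothetical synthetic cohort: sparse, uneven pain reports on a 0-10 scale.
rng = np.random.default_rng(42)
grid = np.linspace(0.0, 90.0, 30)  # shared grid over an assumed 90-day window

def simulate_patient(kind):
    n_obs = rng.integers(8, 25)                 # sparse: few reports per patient
    t = np.sort(rng.uniform(0.0, 90.0, n_obs))  # uneven report times
    if kind == "low":
        pain = 1.0 + 6.0 * (rng.random(n_obs) < 0.1)  # occasional acute spikes
    elif kind == "fluctuating":
        phase = rng.uniform(0.0, 2.0 * np.pi)
        pain = 5.0 + 3.0 * np.sin(2.0 * np.pi * t / 14.0 + phase)
    else:
        pain = 8.5 * np.ones(n_obs)             # persistent high pain
    return t, np.clip(pain + rng.normal(0.0, 0.5, n_obs), 0.0, 10.0)

kinds = ["low"] * 10 + ["fluctuating"] * 10 + ["high"] * 10
X = np.array([align_to_grid(*simulate_patient(kind), grid) for kind in kinds])
labels = spectral_labels(X, k=3)
print(labels)
```

Swapping `align_to_grid` for a different alignment rule changes the feature matrix `X` and therefore the affinity graph. That is one way to see why the choice of alignment can shift the apparent optimal number of clusters, even on synthetic data with known structure.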