Clustering Higher Order Data: An Application to Pediatric Multi-variable Longitudinal Data
This work addresses the need to cluster multivariate longitudinal data in clinical settings, such as when predicting cardiovascular health in youth with chronic conditions. The contribution is incremental: it extends existing model-based clustering methods from two-dimensional arrays to higher-order arrays.
The paper tackles the problem of clustering higher-order data, specifically 4-dimensional accelerometer arrays from pediatric longitudinal studies, by developing a finite mixture model for multidimensional arrays, extending previous work limited to 2D arrays.
Physical activity levels are an important predictor of cardiovascular health and are increasingly measured by sensors such as accelerometers. Accelerometers produce rich multivariate data that can inform important clinical decisions related to individual patients and public health. The CHAMPION study, a study of youth with chronic inflammatory conditions, aims to determine the links between heart health, inflammation, physical activity, and fitness. The accelerometer data from CHAMPION are represented as 4-dimensional arrays, and a finite mixture of multidimensional arrays model is developed to cluster them. Model-based clustering of multidimensional arrays has thus far been limited to two-dimensional arrays, i.e., matrices or order-two tensors. The work in this paper can therefore also be seen as an approach for clustering D-dimensional arrays with D > 2 or, in other words, for clustering order-D tensors.
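To make the idea of a finite mixture over order-D arrays concrete, the following is a minimal sketch rather than the paper's implementation. The summary above does not state the component distribution, so the sketch assumes each component is a tensor (array) normal with one covariance matrix per mode, which generalizes the matrix normal used for two-dimensional arrays. The function names `tensor_normal_logpdf` and `responsibilities` are hypothetical, and only NumPy is used.

```python
import numpy as np


def tensor_normal_logpdf(x, mean, mode_covs):
    """Log-density of an order-D array under an ASSUMED tensor normal model.

    x, mean   : arrays of shape (n_1, ..., n_D)
    mode_covs : list of D positive-definite matrices, the d-th of shape (n_d, n_d)

    The implied covariance of the flattened array is the Kronecker product of
    the mode covariances, but it is never formed explicitly here.
    """
    resid = x - mean
    n_total = resid.size
    scaled = resid
    logdet_term = 0.0
    for axis, cov in enumerate(mode_covs):
        n_d = cov.shape[0]
        # log|kron(S_1, ..., S_D)| = sum_d (N / n_d) * log|S_d|
        logdet_term += (n_total / n_d) * np.linalg.slogdet(cov)[1]
        # Apply the inverse of this mode's covariance along the matching axis.
        moved = np.moveaxis(scaled, axis, 0)
        solved = np.linalg.solve(cov, moved.reshape(n_d, -1)).reshape(moved.shape)
        scaled = np.moveaxis(solved, 0, axis)
    quad = np.sum(resid * scaled)  # Mahalanobis-type quadratic form
    return -0.5 * (n_total * np.log(2.0 * np.pi) + logdet_term + quad)


def responsibilities(x, weights, means, covs):
    """Posterior component probabilities for one array (the E-step quantity).

    weights : mixing proportions, one per component
    means   : list of mean arrays, one per component
    covs    : list (per component) of lists of mode covariances
    """
    log_terms = np.array([
        np.log(w) + tensor_normal_logpdf(x, m, c)
        for w, m, c in zip(weights, means, covs)
    ])
    log_terms -= log_terms.max()  # stabilize before exponentiating
    probs = np.exp(log_terms)
    return probs / probs.sum()
```

Under this assumption, an order-4 array, such as one subject's accelerometer record, is assigned to the component with the largest responsibility. A full fit would alternate this step with updates of the weights, means, and mode covariances, which the sketch omits.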