Jennifer Webster

2 Papers

LG Nov 10, 2022
Spatiotemporal k-means

Olga Dorabiala, Devavrat Vivek Dabke, Jennifer Webster et al.

Spatiotemporal data is increasingly available due to emerging sensor and data acquisition technologies that track moving objects. Spatiotemporal clustering addresses the need to efficiently discover patterns and trends in moving object behavior without human supervision. One application of interest is the discovery of moving clusters, where clusters have a static identity, but their location and content can change over time. We propose a two-phase spatiotemporal clustering method called spatiotemporal k-means (STkM) that can analyze the multi-scale relationships within spatiotemporal data. By optimizing an objective function that is unified over space and time, the method can track dynamic clusters at both short and long timescales with minimal parameter tuning and no post-processing. We begin by proposing a theoretical generating model for spatiotemporal data and prove the efficacy of STkM in this setting. We then evaluate STkM on a recently developed collective animal behavior benchmark dataset and show that STkM outperforms baseline methods in the low-data limit, which is a critical regime in many emerging applications. Finally, we showcase how STkM can be extended to more complex machine learning tasks, particularly unsupervised region of interest detection and tracking in videos.

AP Jun 22, 2016
Personalized Prognostic Models for Oncology: A Machine Learning Approach

David Dooling, Angela Kim, Barbara McAneny et al.

We have applied a little-known data transformation to subsets of the publicly available Surveillance, Epidemiology, and End Results (SEER) data of the National Cancer Institute (NCI) to make it suitable input to standard machine learning classifiers. This transformation properly treats the right-censored data in the SEER data, and the resulting Random Forest and Multi-Layer Perceptron models predict full survival curves. Treating the 6-, 12-, and 60-month points of the resulting survival curves as 3 binary classifiers, the 18 resulting classifiers have AUC values ranging from 0.765 to 0.885. Further evidence that the models have generalized well from the training data is provided by the extremely high levels of agreement between the random forest and neural network models' predictions on the 6-, 12-, and 60-month binary classifiers.
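
To make the idea of reading a survival outcome as a binary classifier at a fixed horizon concrete, here is a minimal Python sketch of one common way to label right-censored records. It is not the transformation used in the paper, which the abstract does not specify. The function names `horizon_label` and `labels_for_patient` and the record fields are hypothetical, chosen only for illustration; the three horizons are the ones named in the abstract.

```python
from typing import Dict, Optional

# Horizons (in months) named in the abstract.
HORIZONS_MONTHS = (6, 12, 60)


def horizon_label(months_followed: float, died: bool, horizon: float) -> Optional[int]:
    """Label one patient at one horizon (illustrative assumption, not the paper's method).

    Returns 1 if the patient is known to have survived past `horizon`,
    0 if the patient is known to have died at or before `horizon`,
    and None if follow-up ended (censoring) at or before `horizon`,
    in which case the outcome at that horizon is unknown.
    """
    if months_followed > horizon:
        # Observed beyond the horizon, so alive at the horizon
        # whether or not a death was recorded later.
        return 1
    if died:
        return 0
    return None  # right-censored before the horizon was reached


def labels_for_patient(months_followed: float, died: bool) -> Dict[int, Optional[int]]:
    """Build one label per horizon for a single patient record."""
    return {h: horizon_label(months_followed, died, h) for h in HORIZONS_MONTHS}
```

Records labeled `None` at a given horizon would be left out of that horizon's classifier rather than counted as survivors or deaths, which is what keeps censored follow-up from biasing the labels in this simple scheme.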