Logic-based Clustering and Learning for Time-Series Data
This work addresses the challenge of processing large datasets that designers face in domains such as automotive testing, traffic analysis, and online education, though it appears incremental because it applies an existing logic-based method to new applications.
The paper tackles the data deluge problem in cyberphysical systems by using monotonic Parametric Signal Temporal Logic (PSTL) to design features for unsupervised classification of time-series data, enabling automatic clustering of similar traces with interpretable formulas.
To effectively analyze and design cyberphysical systems (CPS), designers today have to combat the data deluge problem, i.e., the burden of processing intractably large amounts of data produced by complex models and experiments. In this work, we utilize monotonic Parametric Signal Temporal Logic (PSTL) to design features for unsupervised classification of time-series data. This enables using off-the-shelf machine learning tools to automatically cluster similar traces with respect to a given PSTL formula. Using a few representative examples, we demonstrate how this technique produces interpretable formulas that are amenable to analysis and understanding. We illustrate this with case studies related to automotive engine testing, highway traffic analysis, and auto-grading massive open online courses (MOOCs).
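The idea of turning a monotonic PSTL template into a clustering feature can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes a single-parameter template G (x <= h), which is monotonic because any h larger than a satisfying h also satisfies it, so the tightest satisfying value can be found by bisection. The function names, the sample traces, and the toy 1-D two-means routine (a stand-in for an off-the-shelf clustering tool) are all hypothetical.

```python
def always_below(trace, h):
    """Boolean semantics of the template G (x <= h) on a finite sampled trace."""
    return all(x <= h for x in trace)


def tightest_param(trace, lo, hi, tol=1e-6):
    """Bisect for (approximately) the smallest h in [lo, hi] satisfying G (x <= h).

    Correct only because the template is monotonic in h:
    if h satisfies the formula, so does every larger h.
    """
    assert always_below(trace, hi), "upper bound must satisfy the formula"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if always_below(trace, mid):
            hi = mid  # mid satisfies: the boundary is at or below mid
        else:
            lo = mid  # mid violates: the boundary is above mid
    return hi


def two_means_1d(values, iters=20):
    """Toy 1-D 2-means; a stand-in for an off-the-shelf clustering tool."""
    c0, c1 = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
        g0 = [v for v, l in zip(values, labels) if l == 0]
        g1 = [v for v, l in zip(values, labels) if l == 1]
        if g0:
            c0 = sum(g0) / len(g0)
        if g1:
            c1 = sum(g1) / len(g1)
    return labels


# Hypothetical traces: two with low peaks, two with high peaks.
traces = [
    [0.1, 0.4, 0.9, 0.5],
    [0.2, 0.8, 1.0, 0.3],
    [0.0, 2.9, 3.1, 1.0],
    [0.5, 3.0, 2.7, 0.2],
]

# One feature per trace: the tightest h for which G (x <= h) holds.
features = [tightest_param(t, lo=0.0, hi=10.0) for t in traces]
labels = two_means_1d(features)
```

Each cluster can then be read back as a formula. In this sketch, the traces in the low-peak cluster all satisfy G (x <= h) for the largest feature value h found in that cluster, which illustrates the kind of interpretable description the abstract refers to.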