Assessing pattern recognition or labeling in streams of temporal data
This work addresses the assessment of pattern recognition in temporal data streams, which is critical for applications such as early detection. It appears incremental, however, since it builds on existing editing distance methods.
The paper tackles the problem of evaluating pattern recognition or labeling in temporal data streams by proposing an editing distance approach to align labeled segments with ground truth, enabling derivation of standard evaluation measures and latency metrics for early detection applications.
In the context of the data deluge, pattern recognition or labeling in streams is becoming an essential and pressing task as data flows through ever larger streams. Assessing such tasks is difficult when dealing with temporal data, namely patterns that have a duration (a beginning and an end time-stamp). This paper details an approach based on an editing distance that first aligns a sequence of labeled temporal segments with a ground truth sequence and then, by back-tracing an optimal alignment path, provides a confusion matrix at the label level. From this confusion matrix, standard evaluation measures can easily be derived, as well as other measures such as the "latency", which can be quite important in (early) pattern detection applications.
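The align-then-back-trace idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the cost model is an assumption made here for illustration (a match costs one minus the temporal intersection-over-union, non-overlapping segments cannot be matched, and leaving a segment unmatched costs a fixed `GAP`), and the latency is assumed to be the difference between predicted and true start times for correctly labeled matches. The names `iou`, `align`, `confusion`, and `latencies` are hypothetical.

```python
from collections import Counter

GAP = 1.0  # assumed cost of leaving a segment unmatched (miss or false alarm)


def iou(a, b):
    """Temporal intersection-over-union of two (label, start, end) segments."""
    inter = min(a[2], b[2]) - max(a[1], b[1])
    if inter <= 0:
        return 0.0
    return inter / (max(a[2], b[2]) - min(a[1], b[1]))


def align(truth, pred):
    """Edit-distance alignment of two time-ordered segment sequences.

    Returns the total cost and the list of aligned pairs (truth, pred),
    where None marks an unmatched side.
    """
    n, m = len(truth), len(pred)
    inf = float("inf")
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + GAP
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            o = iou(truth[i - 1], pred[j - 1])
            match = D[i - 1][j - 1] + (1.0 - o) if o > 0 else inf
            D[i][j] = min(match, D[i - 1][j] + GAP, D[i][j - 1] + GAP)

    # Back-trace one optimal path from the bottom-right corner.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            o = iou(truth[i - 1], pred[j - 1])
            if o > 0 and D[i][j] == D[i - 1][j - 1] + (1.0 - o):
                pairs.append((truth[i - 1], pred[j - 1]))
                i, j = i - 1, j - 1
                continue
        if j == 0 or (i > 0 and D[i][j] == D[i - 1][j] + GAP):
            pairs.append((truth[i - 1], None))  # missed ground-truth segment
            i -= 1
        else:
            pairs.append((None, pred[j - 1]))  # spurious predicted segment
            j -= 1
    pairs.reverse()
    return D[n][m], pairs


def confusion(pairs):
    """Label-level confusion counts keyed by (true label, predicted label)."""
    counts = Counter()
    for t, p in pairs:
        counts[(t[0] if t else None, p[0] if p else None)] += 1
    return counts


def latencies(pairs):
    """Assumed latency: predicted start minus true start, correct matches only."""
    return [p[1] - t[1] for t, p in pairs if t and p and t[0] == p[0]]


# Made-up example data: (label, start, end)
truth = [("A", 0, 10), ("B", 12, 20), ("A", 25, 30)]
pred = [("A", 2, 10), ("A", 13, 19), ("B", 40, 45)]

cost, pairs = align(truth, pred)
print(cost)
print(confusion(pairs))
print(latencies(pairs))
```

In this toy example the first two predictions overlap ground-truth segments and are matched (one with the correct label, one confused), while the last ground-truth segment and the last prediction stay unmatched. Keys containing `None` in the confusion counts play the role of a "no segment" row or column, from which misses and false alarms, and hence precision and recall, could be computed.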