DS IR · Feb 13, 2018

Efficient Discovery of Variable-length Time Series Motifs with Large Length Range in Million Scale Time Series

arXiv:1802.04883v15.113 citations

Originality: Highly original

AI Analysis

This work addresses a bottleneck in time series analysis for domains like data mining and pattern recognition, offering a more efficient solution for motif discovery in million-scale datasets.

The paper tackles the problem of efficiently discovering variable-length motifs in large time series, where existing methods can take hours or days when the enumeration range is large. It introduces HIME, an approximate algorithm that scales significantly better than the state-of-the-art algorithm and detects motifs over a considerably larger length range than previous sequence-matching-based approximate approaches.

Detecting repeated variable-length patterns, also called variable-length motifs, has received a great amount of attention in recent years. The current state-of-the-art algorithm uses a fixed-length motif discovery algorithm as a subroutine to enumerate variable-length motifs. As a result, it may take hours or days to execute when the enumeration range is large. In this work, we introduce an approximate algorithm called HierarchIcal based Motif Enumeration (HIME) to detect variable-length motifs with a large enumeration range in million-scale time series. We show in the experiments that the scalability of the proposed algorithm is significantly better than that of the state-of-the-art algorithm. Moreover, the motif length range detected by HIME is considerably larger than that of previous sequence-matching-based approximate variable-length motif discovery approaches. We demonstrate that HIME can efficiently detect meaningful variable-length motifs in long, real-world time series.
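
To make the cost of the enumeration strategy described in the abstract concrete, here is a minimal Python sketch of a naive baseline: a brute-force fixed-length motif search that is re-run once for every length in the enumeration range. This is neither HIME nor the specific state-of-the-art algorithm the paper compares against. The function names (`znorm`, `fixed_length_motif`, `enumerate_motifs`), the use of z-normalized Euclidean distance, and the rule that a motif pair must not overlap are all assumptions made for illustration.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std < 1e-12:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def fixed_length_motif(series, length):
    """Brute-force closest pair of non-overlapping subsequences of one length.

    Returns (distance, i, j), or None if no non-overlapping pair exists.
    """
    windows = [znorm(series[i:i + length])
               for i in range(len(series) - length + 1)]
    best = None
    for i in range(len(windows)):
        # Start j at i + length so the two subsequences never overlap.
        for j in range(i + length, len(windows)):
            d = math.dist(windows[i], windows[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    return best


def enumerate_motifs(series, min_len, max_len):
    """Run the fixed-length search once per length in [min_len, max_len]."""
    result = {}
    for length in range(min_len, max_len + 1):
        motif = fixed_length_motif(series, length)
        if motif is not None:
            result[length] = motif
    return result


# Toy series with the pattern [1, 5, 2, 8] planted at offsets 2 and 10.
series = [3, 7, 1, 5, 2, 8, 4, 9, 6, 0, 1, 5, 2, 8, 7, 3]
for length, (dist, i, j) in enumerate_motifs(series, 4, 6).items():
    print(length, round(dist, 3), i, j)
```

Each call to `fixed_length_motif` in this sketch compares every pair of windows, and `enumerate_motifs` repeats that work for every candidate length, so the total cost grows with the width of the length range. Practical fixed-length subroutines are far faster than this brute-force loop, but the per-length repetition is the part the abstract identifies as the bottleneck for large enumeration ranges.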
