Semantic Discord: Finding Unusual Local Patterns for Time Series
This work addresses the difficulty of detecting local anomalies in time series, for applications such as monitoring and diagnostics, and proposes a new method for this known bottleneck.
The paper tackles the problem of finding anomalous subsequences in time series by introducing semantic discord, a definition that incorporates context information from the larger subsequences containing the anomaly candidates. The resulting algorithm is up to 3 orders of magnitude faster than brute force on real-world data and significantly outperforms state-of-the-art methods in locating anomalies.
Finding anomalous subsequences in a long time series is an important but difficult problem. Existing state-of-the-art methods focus on searching for the subsequence that is the most dissimilar to the rest of the subsequences; however, they do not take into account the background patterns that contain the anomaly candidates. As a result, such approaches are likely to miss local anomalies. We introduce a new definition named \textit{semantic discord}, which incorporates the context information from larger subsequences containing the anomaly candidates. We propose an efficient algorithm with a derived lower bound that is up to 3 orders of magnitude faster than the brute-force algorithm on real-world data. Through extensive experiments, we demonstrate that our method significantly outperforms the state-of-the-art methods in locating anomalies. We further explain the interpretability of semantic discord.
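
For orientation, the baseline notion that the abstract contrasts against (the subsequence most dissimilar to all other subsequences) can be sketched as a brute-force search. This is a minimal illustration of the classical discord idea only, not the semantic discord definition or the lower-bound algorithm proposed in the paper. The function names `znorm` and `brute_force_discord`, the use of z-normalized Euclidean distance, and the exclusion of overlapping windows are all assumptions made for this sketch.

```python
import numpy as np


def znorm(x):
    """Z-normalize a window; a constant window maps to all zeros (assumed convention)."""
    s = x.std()
    return (x - x.mean()) / s if s > 1e-12 else np.zeros_like(x)


def brute_force_discord(ts, m):
    """Return (start index, distance) of the length-m window whose nearest
    non-overlapping neighbor is farthest away. Cost is quadratic in the
    number of windows, which is why faster algorithms are of interest."""
    ts = np.asarray(ts, dtype=float)
    n = len(ts) - m + 1
    subs = np.array([znorm(ts[i:i + m]) for i in range(n)])
    best_idx, best_dist = -1, -np.inf
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        # Ignore trivial matches: windows that overlap window i.
        d[max(0, i - m + 1):min(n, i + m)] = np.inf
        nn = d.min()
        if np.isfinite(nn) and nn > best_dist:
            best_idx, best_dist = i, nn
    return best_idx, best_dist


# Synthetic example (not from the paper): a sine wave with a flattened stretch.
t = np.arange(400)
series = np.sin(2 * np.pi * t / 20)
series[200:210] = 0.0
idx, dist = brute_force_discord(series, m=20)
print(idx, round(dist, 3))
```

Because this search compares each window only against the rest of the series as a whole, it has no notion of the surrounding context of a candidate, which is the gap the abstract says semantic discord is meant to address.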