AWT -- Clustering Meteorological Time Series Using an Aggregated Wavelet Tree
This work addresses clustering and outlier detection for meteorological data, specifically in urban climate applications. Its contribution is incremental, since it integrates ideas from existing K-Means algorithms.
The authors tackle clustering and outlier detection in meteorological time series by presenting the AWT algorithm, which chooses the number of clusters automatically from a user-defined threshold and handles heterogeneous data as well as data sets larger than the available memory. They apply it to crowd-sourced temperature data from Vienna, where it detects outliers and yields clusters that map to urban land-use characteristics.
Both clustering and outlier detection play an important role in meteorological measurements. We present the AWT algorithm, a clustering algorithm for time series data that also performs implicit outlier detection during the clustering. AWT integrates ideas of several well-known K-Means clustering algorithms. It chooses the number of clusters automatically based on a user-defined threshold parameter, and it can be used for heterogeneous meteorological input data as well as for data sets that exceed the available memory size. We apply AWT to crowd-sourced 2-m temperature data with an hourly resolution from the city of Vienna to detect outliers and to investigate whether the final clusters show general similarities as well as similarities with urban land-use characteristics. We show that both the outlier detection and the implicit mapping to land-use characteristics are possible with AWT, which opens new fields of application, specifically in the rapidly evolving field of urban climate and urban weather.
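To illustrate how a user-defined threshold can fix the number of clusters and flag outliers as a by-product, the following is a minimal sketch of a one-pass, threshold-based ("leader"-style) clustering. It is not the AWT algorithm, whose details are not given here. The function name `threshold_cluster`, the Euclidean distance, the `min_size` rule for outliers, and the toy temperature values are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long time series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def threshold_cluster(series, threshold, min_size=2):
    """One-pass threshold clustering (illustrative, NOT the AWT algorithm).

    Each series joins its nearest centroid if that centroid is within
    `threshold`; otherwise it starts a new cluster. The number of clusters
    therefore follows from the threshold. Clusters with fewer than
    `min_size` members are reported as outliers (an assumed rule).
    """
    centroids = []  # running mean of each cluster
    members = []    # indices of the series assigned to each cluster
    for idx, s in enumerate(series):
        best, best_d = None, float("inf")
        for c_idx, c in enumerate(centroids):
            d = euclidean(s, c)
            if d < best_d:
                best, best_d = c_idx, d
        if best is not None and best_d <= threshold:
            members[best].append(idx)
            n = len(members[best])
            # Incremental mean update, so each series is read only once.
            centroids[best] = [c + (x - c) / n for c, x in zip(centroids[best], s)]
        else:
            centroids.append(list(s))
            members.append([idx])
    clusters = [m for m in members if len(m) >= min_size]
    outliers = [i for m in members if len(m) < min_size for i in m]
    return clusters, outliers


# Hypothetical 3-hour temperature series (degrees Celsius) from five stations.
series = [
    [10.0, 11.0, 12.0],
    [10.2, 11.1, 12.1],
    [20.0, 21.0, 22.0],
    [20.1, 20.9, 22.2],
    [35.0, 5.0, 40.0],  # implausible readings
]
clusters, outliers = threshold_cluster(series, threshold=1.0)
print(clusters)  # [[0, 1], [2, 3]]
print(outliers)  # [4]
```

Because each series is read once and only the centroids are kept in memory, a scheme of this kind can in principle process data sets larger than the available memory. The result depends on the input order, which is one reason practical algorithms refine such a first pass.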