ST · ML · Jun 14, 2021

Outlier detection in multivariate functional data through a contaminated mixture model

arXiv:2106.07222v2 · 14 citations
Originality: Incremental advance
AI Analysis

This work addresses outlier detection in industrial sensor data. It is an incremental improvement over existing methods: it removes the need to pre-specify the proportion of outliers.

The paper tackles the automatic detection of abnormal sensor measurements in industrial settings. It treats the measurements as multivariate functional data and fits a contaminated mixture model that clusters the data and identifies outliers at the same time, without requiring the outlier proportion to be specified. Numerical experiments show the model outperforms competitors, and it successfully detects abnormal behaviors in real-world data.

In an industrial context, the activity of sensors is recorded at a high frequency. A challenge is to automatically detect abnormal measurement behavior. Considering the sensor measures as functional data, the problem can be formulated as the detection of outliers in a multivariate functional data set. Because this data set is heterogeneous, the proposed contaminated mixture model both clusters the multivariate functional data into homogeneous groups and detects outliers. The main advantage of this procedure over its competitors is that it does not require the proportion of outliers to be specified. Model inference is performed through an Expectation-Conditional Maximization algorithm, and the BIC is used to select the number of clusters. Numerical experiments on simulated data demonstrate the high performance achieved by the inference algorithm. In particular, the proposed model outperforms the competitors. Applied to the real data that motivated this study, it correctly detects abnormal behaviors.
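
To make the idea of a contaminated mixture concrete, here is a minimal sketch in Python. It is not the paper's functional model or its ECM algorithm. It assumes each curve has already been reduced to a finite coefficient vector (for example by a basis expansion) and uses the generic contaminated Gaussian mixture, in which each cluster k mixes a "good" component N(mu_k, Sigma_k) with weight alpha_k and an inflated component N(mu_k, eta_k * Sigma_k), eta_k > 1, with weight 1 - alpha_k. All function and parameter names (log_gauss, e_step, flag_outliers, bic, alpha, eta) are illustrative choices, and the parameters are taken as given rather than estimated.

```python
import numpy as np

def log_gauss(X, mu, Sigma):
    """Log-density of N(mu, Sigma) evaluated at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(L, (X - mu).T)            # shape (d, n)
    maha = np.sum(z ** 2, axis=0)                 # squared Mahalanobis distance
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def e_step(X, pi, mu, Sigma, alpha, eta):
    """Posterior cluster and 'good point' probabilities for fixed parameters.

    Assumed (hypothetical) parameterisation: pi[k] cluster weight,
    alpha[k] proportion of good points, eta[k] > 1 covariance inflation.
    """
    n, K = X.shape[0], len(pi)
    log_mix = np.empty((n, K))
    good = np.empty((n, K))
    for k in range(K):
        lg = np.log(alpha[k]) + log_gauss(X, mu[k], Sigma[k])
        lb = np.log(1.0 - alpha[k]) + log_gauss(X, mu[k], eta[k] * Sigma[k])
        lc = np.logaddexp(lg, lb)                 # log-density of cluster k
        good[:, k] = np.exp(lg - lc)              # P(good | cluster k, x)
        log_mix[:, k] = np.log(pi[k]) + lc
    log_lik_i = np.logaddexp.reduce(log_mix, axis=1)
    z = np.exp(log_mix - log_lik_i[:, None])      # P(cluster k | x)
    return z, good, float(log_lik_i.sum())

def flag_outliers(z, good, threshold=0.5):
    """Assign each point to its most probable cluster; flag it as an
    outlier when its 'good' probability in that cluster is below threshold."""
    labels = z.argmax(axis=1)
    is_outlier = good[np.arange(len(labels)), labels] < threshold
    return labels, is_outlier

def bic(log_lik, n_params, n):
    """BIC in the 'lower is better' convention (sign conventions vary)."""
    return -2.0 * log_lik + n_params * np.log(n)
```

Because the outlier decision comes from a posterior probability inside each cluster, no global outlier proportion has to be fixed in advance; alpha_k would be estimated along with the other parameters. In a full procedure, an EM-type loop would alternate this E-step with parameter updates, and the fit would be repeated for several numbers of clusters and compared by BIC.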

Code Implementations: 1 repo