Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection
This work addresses anomaly detection in machine sounds for industrial monitoring. Its contribution is incremental, since it builds on existing outlier exposure and contrastive learning methods.
The paper tackles unsupervised anomalous sound detection by proposing a data augmentation method that targets high-frequency information in contrastive learning, so that the model focuses on the low-frequency content associated with normal machine operation. On the DCASE 2020 Task 2 dataset, the method outperformed other contrastive learning methods.
The outlier exposure method is an effective approach to the unsupervised anomalous sound detection problem. Its key concern is how to make the model learn the distribution space of normal data. Based on biological perception and data analysis, we find that anomalous audio and noise often contain higher-frequency content. We therefore propose a data augmentation method for high-frequency information in contrastive learning. This enables the model to pay more attention to the low-frequency information of the audio, which represents the normal operational mode of the machine. We evaluated the proposed method on the DCASE 2020 Task 2 dataset, where it outperformed other contrastive learning methods used on this dataset. We also evaluated the generalizability of our method on the DCASE 2022 Task 2 dataset.
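
The abstract does not specify how the high-frequency augmentation is implemented, so the following is only a minimal sketch of one way such an augmentation could look. It assumes the input is a spectrogram of shape (frequency bins, frames), and that two contrastive views are produced by perturbing only the bins above a cutoff while leaving the low-frequency band untouched. The function name `augment_high_freq`, the parameters `cutoff_bin` and `noise_scale`, and the gain-plus-noise perturbation are all hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np

def augment_high_freq(spec, cutoff_bin, rng, noise_scale=0.5):
    """Return a copy of `spec` (freq_bins x frames) with bins >= cutoff_bin perturbed.

    Assumed scheme: a random per-bin gain plus additive Gaussian noise,
    applied to the high band only. The low band is copied unchanged.
    """
    view = spec.copy()
    n_high = spec.shape[0] - cutoff_bin
    gain = rng.uniform(0.0, 1.0, size=(n_high, 1))
    noise = rng.normal(0.0, noise_scale, size=(n_high, spec.shape[1]))
    view[cutoff_bin:] = spec[cutoff_bin:] * gain + noise
    return view

rng = np.random.default_rng(0)
spec = rng.normal(size=(128, 64))  # random stand-in for a log-mel spectrogram

# Two views of the same clip: they differ only in the high-frequency band,
# so a contrastive objective that pulls them together has to rely on the
# shared low-frequency content.
view_a = augment_high_freq(spec, cutoff_bin=64, rng=rng)
view_b = augment_high_freq(spec, cutoff_bin=64, rng=rng)
```

In a contrastive setup, `view_a` and `view_b` would be treated as a positive pair. Because their low-frequency bands are identical, the encoder is encouraged to become invariant to high-frequency variation, which matches the intuition described above.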