Teacher Encoder-Student Decoder Denoising Guided Segmentation Network for Anomaly Detection
This work addresses anomaly detection in visual data, an important problem for industrial inspection and security. The contribution is incremental, since it builds on existing student-teacher frameworks.
The paper tackles visual anomaly detection by proposing PFADSeg, a model that combines a teacher encoder and a denoising student decoder with multi-scale feature fusion. It reports state-of-the-art results on the MVTec AD dataset: an image-level AUC of 98.9%, a pixel-level mean precision of 76.4%, and an instance-level mean precision of 78.7%.
Visual anomaly detection is a highly challenging task, often categorized as a one-class classification and segmentation problem. Recent studies have demonstrated that the student-teacher (S-T) framework effectively addresses this challenge. However, most S-T frameworks rely solely on pre-trained teacher networks to guide student networks in learning similar multi-scale features, overlooking the potential of the student networks to enhance learning through multi-scale feature fusion. In this study, we propose a novel model named PFADSeg, which integrates a pre-trained teacher network, a denoising student network with multi-scale feature fusion, and a guided anomaly segmentation network into a unified framework. By adopting a unique teacher-encoder and student-decoder denoising mode, the model improves the student network's ability to learn from teacher network features. Furthermore, an adaptive feature fusion mechanism is introduced to train a self-supervised segmentation network that synthesizes anomaly masks autonomously, significantly increasing detection performance. Evaluated on the MVTec AD dataset, PFADSeg achieves state-of-the-art results with an image-level AUC of 98.9%, a pixel-level mean precision of 76.4%, and an instance-level mean precision of 78.7%.
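To make the S-T idea concrete, the sketch below shows the generic way student-teacher methods are commonly scored: a per-pixel cosine distance between teacher and student features at each scale, upsampled to a common resolution and averaged. This is an illustration under stated assumptions, not PFADSeg's implementation. The abstract describes a learned, guided segmentation network with adaptive fusion, whereas plain averaging is swapped in here for simplicity. The function names (`cosine_distance_map`, `upsample_nearest`, `fused_anomaly_map`, `image_score`), the nearest-neighbor upsampling, and the max-based image score are all hypothetical choices made for this example.

```python
import numpy as np


def cosine_distance_map(teacher_feat, student_feat, eps=1e-8):
    """Per-location cosine distance between two (C, H, W) feature maps."""
    dot = (teacher_feat * student_feat).sum(axis=0)
    norms = np.linalg.norm(teacher_feat, axis=0) * np.linalg.norm(student_feat, axis=0)
    return 1.0 - dot / (norms + eps)


def upsample_nearest(score_map, out_hw):
    """Nearest-neighbor resize of an (h, w) map to out_hw = (H, W)."""
    h, w = score_map.shape
    out_h, out_w = out_hw
    rows = (np.arange(out_h) * h) // out_h
    cols = (np.arange(out_w) * w) // out_w
    return score_map[np.ix_(rows, cols)]


def fused_anomaly_map(teacher_feats, student_feats, out_hw):
    """Average the per-scale distance maps at a common resolution.

    Assumption: uniform averaging stands in for the learned, adaptive
    fusion that the abstract attributes to the segmentation network.
    """
    maps = [
        upsample_nearest(cosine_distance_map(t, s), out_hw)
        for t, s in zip(teacher_feats, student_feats)
    ]
    return np.mean(maps, axis=0)


def image_score(anomaly_map):
    """Image-level score taken as the map maximum (an assumed convention)."""
    return float(anomaly_map.max())
```

Under these assumptions, a student that reproduces the teacher's features everywhere yields a map close to zero, while a location where the student's features diverge produces a high value at that pixel and therefore a high image-level score.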