Context Enhancement with Reconstruction as Sequence for Unified Unsupervised Anomaly Detection
This work addresses the challenge of improving contextual awareness in anomaly detection models for applications like industrial inspection or medical imaging, representing an incremental advancement in the field.
The paper tackles the problem of insufficient contextual awareness in feature-reconstruction-based unsupervised anomaly detection, proposing a Reconstruction as Sequence (RAS) method that integrates a transformer-based RASFormer block to enhance spatial and sequential dependencies, resulting in significant performance improvements over competing methods.
Unsupervised anomaly detection (AD) aims to train robust detection models using only normal samples, such that the models generalize well to unseen anomalies. Recent research focuses on a unified unsupervised AD setting in which a single model is trained for all classes, i.e., the n-class-one-model paradigm. Feature-reconstruction-based methods achieve state-of-the-art performance in this setting. However, existing methods often lack sufficient contextual awareness, which compromises the quality of the reconstruction. To address this issue, we introduce a novel Reconstruction as Sequence (RAS) method, which enhances contextual correspondence during feature reconstruction from a sequence modeling perspective. In particular, building on the transformer architecture, we integrate a specialized RASFormer block into RAS. This block captures spatial relationships among different image regions and strengthens sequential dependencies throughout the reconstruction process. By incorporating the RASFormer block, our RAS method achieves stronger contextual awareness and, in turn, better reconstruction quality. Experimental results show that RAS significantly outperforms competing methods, demonstrating its effectiveness. Our code is available at https://github.com/Nothingtolose9979/RAS.
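To make the feature-reconstruction idea concrete, the following is a minimal sketch of how an anomaly map is commonly derived from reconstruction error in this family of methods. It is not the RAS implementation: the function names `anomaly_map` and `image_score`, the use of per-location L2 distance, and the max-pooling image score are all illustrative assumptions, and the toy feature grids stand in for the outputs of a feature extractor and a reconstruction model.

```python
import math

def anomaly_map(features, reconstructed):
    """Per-location L2 distance between original and reconstructed features.

    Both inputs are H x W grids (nested lists) of equal-length feature
    vectors. Assumption: a large reconstruction error marks a location
    the model could not reproduce from what it learned on normal data.
    """
    return [
        [math.dist(f, r) for f, r in zip(row_f, row_r)]
        for row_f, row_r in zip(features, reconstructed)
    ]

def image_score(amap):
    """Image-level score as the maximum of the map (one hypothetical choice)."""
    return max(max(row) for row in amap)

# Toy 2 x 2 grid of 2-dimensional features. The reconstruction matches
# everywhere except the bottom-right location.
features = [[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 1.0], [3.0, 4.0]]]
reconstructed = [[[1.0, 0.0], [0.0, 1.0]],
                 [[1.0, 1.0], [0.0, 0.0]]]

amap = anomaly_map(features, reconstructed)
score = image_score(amap)
print(amap)   # [[0.0, 0.0], [0.0, 5.0]]
print(score)  # 5.0
```

In this sketch each location is scored independently of its neighbours; the contextual awareness discussed above concerns how the reconstruction itself is produced, not how the error is measured.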