Fractals as Pre-training Datasets for Anomaly Detection and Localization
This work addresses data scarcity and privacy concerns in industrial anomaly detection by exploring synthetic pre-training datasets. It is incremental, however: it builds on existing methods and does not achieve state-of-the-art results.
The study evaluated eight state-of-the-art anomaly detection methods pre-trained on dynamically generated fractal images and compared them, without fine-tuning, against counterparts pre-trained on ImageNet, using the MVTec and VisA benchmarks. ImageNet pre-training remained superior, but fractals showed promise on a task that depends on discerning minor visual variations.
Anomaly detection is crucial in large-scale industrial manufacturing as it helps detect and localise defective parts. Pre-training feature extractors on large-scale datasets is a popular approach for this task. However, stringent data security and privacy regulations, together with high costs and long acquisition times, hinder the availability and creation of such large datasets. While recent work in anomaly detection primarily focuses on the development of new methods built on such extractors, the importance of the data used for pre-training has not been studied. Therefore, we evaluated the performance of eight state-of-the-art methods pre-trained using dynamically generated fractal images on the well-known benchmark datasets MVTec and VisA. In contrast to the existing literature, which predominantly examines the transfer-learning capabilities of fractals, we compare models pre-trained with fractal images against those pre-trained with ImageNet, without subsequent fine-tuning. Although pre-training with ImageNet remains a clear winner, the results for fractals are promising considering that the anomaly detection task requires features capable of discerning even minor visual variations. This opens up the possibility of a new research direction in which feature extractors are trained on synthetically generated abstract datasets, reconciling the ever-increasing demand for data in machine learning with the need to circumvent privacy and security concerns.
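The abstract does not say how the fractal images are generated. As a rough illustration only, the sketch below renders a fractal from a randomly sampled iterated function system (IFS) using the chaos game, which is one common way to produce synthetic fractal images. The function names `sample_ifs` and `render_fractal`, the number of affine maps, the contraction factor of 0.8, and the image size are all assumptions made for this example, not details taken from the study.

```python
import random


def sample_ifs(num_maps=3, rng=None):
    """Draw random affine maps (a, b, c, d, e, f) and rescale them to be contractive.

    Hypothetical helper: each map sends (x, y) to (a*x + b*y + e, c*x + d*y + f).
    """
    rng = rng or random.Random()
    maps = []
    for _ in range(num_maps):
        a, b, c, d = (rng.uniform(-1.0, 1.0) for _ in range(4))
        e, f = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        # The Frobenius norm bounds the spectral norm, so fixing it at 0.8
        # (an assumed value) guarantees that every map is a contraction.
        norm = (a * a + b * b + c * c + d * d) ** 0.5
        scale = 0.8 / norm if norm > 0 else 0.0
        maps.append((a * scale, b * scale, c * scale, d * scale, e, f))
    return maps


def render_fractal(maps, size=64, num_points=20000, burn_in=20, rng=None):
    """Render a binary size x size image of the IFS attractor via the chaos game."""
    rng = rng or random.Random()
    x, y = 0.0, 0.0
    points = []
    for i in range(num_points + burn_in):
        # Apply one randomly chosen affine map per step.
        a, b, c, d, e, f = rng.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i >= burn_in:  # skip the first steps, before the orbit nears the attractor
            points.append((x, y))

    # Normalise the point cloud to the pixel grid.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0

    image = [[0] * size for _ in range(size)]
    for px, py in points:
        col = min(size - 1, int((px - min_x) / span_x * size))
        row = min(size - 1, int((py - min_y) / span_y * size))
        image[row][col] = 1
    return image


if __name__ == "__main__":
    # A fresh seed per call yields a new image, i.e. images generated on the fly.
    img = render_fractal(sample_ifs(rng=random.Random(0)), size=32, rng=random.Random(1))
    print("\n".join("".join("#" if v else "." for v in row) for row in img))
```

Sampling a new set of maps for each training image would give an effectively unlimited stream of abstract images with no real-world content, which is the property that makes such data attractive when privacy or acquisition cost is a concern. How classes or labels are assigned for pre-training is left out of this sketch.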