CV · Dec 15, 2023

TAB: Text-Align Anomaly Backbone Model for Industrial Inspection Tasks

arXiv:2312.09480v1
h-index: 1
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Originality: Incremental advance
AI Analysis

This work addresses the need for more efficient and precise anomaly detection in industrial inspection, offering a domain-specific solution that reduces reliance on extensive training data.

The authors address anomaly detection and localization in industrial inspection with a framework that uses the CLIP model to train a backbone tailored to the manufacturing domain. The resulting backbone improves performance on the MVTecAD, BTAD, and KSDD2 datasets, and its pre-trained weights let earlier methods perform better in few-shot scenarios with less training data.

In recent years, the focus on anomaly detection and localization in industrial inspection tasks has intensified. While existing studies have demonstrated impressive outcomes, they often rely heavily on extensive training datasets or robust features extracted from pre-trained models trained on diverse datasets like ImageNet. In this work, we propose a novel framework leveraging the visual-linguistic CLIP model to adeptly train a backbone model tailored to the manufacturing domain. Our approach concurrently considers visual and text-aligned embedding spaces for normal and abnormal conditions. The resulting pre-trained backbone markedly enhances performance in industrial downstream tasks, particularly in anomaly detection and localization. Notably, this improvement is substantiated through experiments conducted on multiple datasets such as MVTecAD, BTAD, and KSDD2. Furthermore, using our pre-trained backbone weights allows previous works to achieve superior performance in few-shot scenarios with less training data. The proposed anomaly backbone provides a foundation model for more precise anomaly detection and localization.
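The abstract does not spell out the training objective, so the following is only a minimal sketch of one common way to align image embeddings with text embeddings for "normal" and "abnormal" conditions: a cross-entropy loss over temperature-scaled cosine similarities, in the style of CLIP. Everything here is an assumption made for illustration, not the authors' method: the function names `l2_normalize` and `text_align_loss`, the temperature of 0.07, the two-prompt setup, and the random vectors that stand in for real backbone and CLIP text-encoder outputs.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def text_align_loss(image_emb, text_emb, labels, temperature=0.07):
    """Hypothetical text-alignment loss (illustrative, not taken from the paper).

    image_emb: (N, D) embeddings from the backbone being trained.
    text_emb:  (2, D) text embeddings; row 0 = "normal", row 1 = "abnormal".
    labels:    (N,) integer labels in {0, 1}.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    # Cosine similarity of every image to both text prompts, scaled by temperature.
    logits = img @ txt.T / temperature
    # Numerically stable log-softmax over the two classes.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Mean negative log-likelihood of the correct prompt for each image.
    return float(-log_probs[np.arange(len(labels)), labels].mean())


# Random stand-ins for real embeddings (assumption: 16-dimensional space).
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(2, 16))
labels = np.array([0, 1, 0, 1])

# Image embeddings placed near the matching prompt vs. near the wrong prompt.
aligned = text_emb[labels] + 0.01 * rng.normal(size=(4, 16))
misaligned = text_emb[1 - labels] + 0.01 * rng.normal(size=(4, 16))

loss_aligned = text_align_loss(aligned, text_emb, labels)
loss_misaligned = text_align_loss(misaligned, text_emb, labels)
print(loss_aligned < loss_misaligned)
```

In this toy setup, embeddings that sit close to the matching prompt give a lower loss than embeddings close to the wrong prompt, which is the behaviour a text-aligned objective would reward. A real pipeline would replace the random vectors with backbone features and CLIP text-encoder outputs.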
