Towards Efficient Pixel Labeling for Industrial Anomaly Detection and Localization
This work addresses the scalability issue in industrial product inspection by reducing the annotation effort needed for defect detection, though the contribution is incremental, since it builds on existing interactive segmentation and cross-modal methods.
The paper tackles the inefficiency of pixel-level annotation for industrial anomaly detection. It proposes ADClick, an interactive image segmentation algorithm that generates annotations from user clicks and text descriptions, achieving high performance (e.g., AP = 96.1% on MVTec AD). It also introduces ADClick-Seg, which reaches state-of-the-art results on the multi-class task (AP = 80.0%, PRO = 97.5%, Pixel-AUROC = 99.1%).
Industrial product inspection is often performed using Anomaly Detection (AD) frameworks trained solely on non-defective samples. Although defective samples can be collected during production, leveraging them usually requires pixel-level annotations, limiting scalability. To address this, we propose ADClick, an Interactive Image Segmentation (IIS) algorithm for industrial anomaly detection. ADClick generates pixel-wise anomaly annotations from only a few user clicks and a brief textual description, enabling precise and efficient labeling that significantly improves AD model performance (e.g., AP = 96.1\% on MVTec AD). We further introduce ADClick-Seg, a cross-modal framework that aligns visual features and textual prompts via a prototype-based approach for anomaly detection and localization. By combining pixel-level priors with language-guided cues, ADClick-Seg achieves state-of-the-art results on the challenging ``Multi-class'' AD task (AP = 80.0\%, PRO = 97.5\%, Pixel-AUROC = 99.1\% on MVTec AD).
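The results above are reported as pixel-level AP and Pixel-AUROC. As a rough illustration of what these two numbers measure, here is a minimal sketch, assuming each image's anomaly-score map and binary ground-truth mask have already been flattened into parallel lists. The function names `pixel_auroc` and `pixel_ap` are hypothetical and do not come from the paper, and this is not the authors' evaluation code. PRO is left out because it additionally requires connected-component analysis of the ground-truth regions.

```python
from typing import Sequence


def pixel_auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pixel-level AUROC via the rank-sum (Mann-Whitney U) formulation.

    scores: per-pixel anomaly scores (higher means more anomalous).
    labels: per-pixel ground truth, 1 for defective and 0 for normal.
    Tied scores receive the average of the ranks they span.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both defective and normal pixels")
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lbl for _, lbl in pairs[i:j])
        i = j
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def pixel_ap(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pixel-level average precision.

    Sums precision at each distinct score threshold, weighted by the
    increase in recall at that threshold.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("AP needs at least one defective pixel")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    true_pos = 0
    ap = 0.0
    i = 0
    while i < len(pairs):
        j = i
        group_pos = 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            group_pos += pairs[j][1]
            j += 1
        true_pos += group_pos
        precision = true_pos / j  # j pixels are predicted anomalous so far
        ap += precision * (group_pos / n_pos)
        i = j
    return ap


if __name__ == "__main__":
    # Toy 2x2 "image", flattened: two defective pixels, two normal ones.
    toy_scores = [0.9, 0.8, 0.3, 0.1]
    toy_labels = [1, 0, 1, 0]
    print(pixel_auroc(toy_scores, toy_labels))
    print(pixel_ap(toy_scores, toy_labels))
```

Because defective pixels are usually a small fraction of an image, AP tends to be the more demanding of the two metrics, which is consistent with the gap between the AP and Pixel-AUROC values quoted above.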