Towards Active Real-to-Twin Inspection: A New Paradigm for Zero-Shot Anomaly Detection
This work addresses the bottleneck that passive 2D imagery imposes on zero-shot anomaly detection for embodied industrial inspection by evaluating active, dynamic observations against CAD models.
The paper introduces Real-to-Twin Anomaly Detection, a new task for zero-shot anomaly detection in industrial inspection that uses CAD Digital Twins as references, and proposes AVATAR, which learns semantic alignment between real and digital twins to localize anomalies without defect annotations. AVATAR substantially outperforms adapted baselines and shows robustness to viewpoint variations.
The deployment of zero-shot anomaly detection (AD) in embodied industrial inspection is bottlenecked by its reliance on passive, fixed-viewpoint 2D imagery. Such formulations cannot accommodate the active, dynamic observations required in real-world environments. To overcome this limitation, we introduce Real-to-Twin Anomaly Detection, a novel task that evaluates physical observations directly against geometrically matched CAD Digital Twins. To tackle this task, we propose AVATAR, a framework designed to learn robust semantic alignment between Real and Digital Twins. By bridging benign Sim2Real domain gaps using only defect-free pairs, AVATAR transforms CAD priors into dynamic, anomaly-free references. This formulation enables the model to localize diverse anomalies in a zero-shot manner as unalignable deviations, eliminating the need for defect annotations. Extensive experiments demonstrate that AVATAR substantially outperforms adapted state-of-the-art baselines and remains robust under severe viewpoint variations. The code and dataset will be made publicly available.
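To make the idea of localizing anomalies as "unalignable deviations" concrete, the following is a minimal illustrative sketch, not AVATAR's actual implementation. It assumes that per-location feature vectors have already been extracted from a real observation and from a rendering of its CAD Digital Twin at the same viewpoint, and that the two feature grids are spatially registered. The function names (`cosine_distance`, `anomaly_map`, `localize`), the use of cosine distance, and the threshold value are all hypothetical choices made for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 0 means aligned, 1 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Treat an all-zero feature as unalignable (an assumption of this sketch).
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def anomaly_map(real_feats, twin_feats):
    """Score each grid location by how poorly real and twin features align.

    Both inputs are H x W grids (nested lists) of feature vectors.
    """
    if len(real_feats) != len(twin_feats):
        raise ValueError("feature grids must have the same height")
    return [
        [cosine_distance(r, t) for r, t in zip(real_row, twin_row)]
        for real_row, twin_row in zip(real_feats, twin_feats)
    ]


def localize(scores, threshold=0.5):
    """Return (row, col) positions whose score exceeds the threshold."""
    return [
        (i, j)
        for i, row in enumerate(scores)
        for j, score in enumerate(row)
        if score > threshold
    ]


# Toy 2 x 2 example: the real observation deviates from the twin at (1, 0).
twin = [[[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [0.0, 1.0]]]
real = [[[1.0, 0.0], [0.0, 1.0]],
        [[0.0, 1.0], [0.0, 1.0]]]

scores = anomaly_map(real, twin)
print(localize(scores))  # prints [(1, 0)]
```

In this sketch the twin acts as an anomaly-free reference, so no defect annotations are needed: any location where the real features cannot be matched to the twin features is flagged. A learned alignment, as described in the abstract, would additionally be expected to absorb benign Sim2Real differences so that they do not produce high scores.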