CV | AI | Jan 1

HarmoniAD: Harmonizing Local Structures and Global Semantics for Anomaly Detection

arXiv:2601.00327v1 | h-index: 1
Originality: Incremental advance
AI Analysis

The paper addresses the detection of tiny defects in industrial product quality inspection. The advance is incremental because it builds on existing methods such as CLIP-based encoders and frequency-based filters.

The paper tackles the structure-semantics trade-off in anomaly detection for industrial inspection by proposing HarmoniAD, a frequency-guided dual-branch framework that balances fine detail and global semantics. It reports state-of-the-art performance on MVTec-AD, VisA, and BTAD.

Anomaly detection is crucial in industrial product quality inspection. Failing to detect tiny defects often leads to serious consequences. Existing methods face a structure-semantics trade-off: structure-oriented models (such as frequency-based filters) are noise-sensitive, while semantics-oriented models (such as CLIP-based encoders) often miss fine details. To address this, we propose HarmoniAD, a frequency-guided dual-branch framework. Features are first extracted by the CLIP image encoder, then transformed into the frequency domain, and finally decoupled into high- and low-frequency paths for complementary modeling of structure and semantics. The high-frequency branch is equipped with a fine-grained structural attention module (FSAM) to enhance textures and edges for detecting small anomalies, while the low-frequency branch uses a global structural context module (GSCM) to capture long-range dependencies and preserve semantic consistency. Together, these branches balance fine detail and global semantics. HarmoniAD further adopts a multi-class joint training strategy, and experiments on MVTec-AD, VisA, and BTAD show state-of-the-art performance with both sensitivity and robustness.
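
The frequency decoupling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain 2D FFT over the spatial axes and a hard circular low-pass mask, and the function name `split_frequency` and the `cutoff` parameter are hypothetical. A random array stands in for CLIP image-encoder features, and the FSAM and GSCM modules are not modeled.

```python
import numpy as np


def split_frequency(features, cutoff=0.25):
    """Split a (C, H, W) feature map into low- and high-frequency parts.

    `cutoff` is a hypothetical parameter: the low-pass radius expressed as
    a fraction of the Nyquist frequency (0.5 cycles per sample).
    """
    _, h, w = features.shape

    # 2D FFT over the spatial axes, with the zero frequency moved to the center.
    spectrum = np.fft.fftshift(
        np.fft.fft2(features, axes=(-2, -1)), axes=(-2, -1)
    )

    # Radial distance of every frequency bin from the zero frequency.
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fy**2 + fx**2)

    # Hard circular mask: True inside the low-frequency disc.
    low_mask = radius <= cutoff * 0.5

    def to_spatial(masked_spectrum):
        # Undo the shift, invert the FFT, and drop the negligible imaginary part.
        unshifted = np.fft.ifftshift(masked_spectrum, axes=(-2, -1))
        return np.fft.ifft2(unshifted, axes=(-2, -1)).real

    low = to_spatial(spectrum * low_mask)    # smooth, global content
    high = to_spatial(spectrum * ~low_mask)  # edges and fine texture
    return low, high


# Stand-in for encoder features: 4 channels on a 16x16 spatial grid.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 16, 16))
low, high = split_frequency(feats)
```

Because the two masks are complementary, `low + high` reconstructs the input up to floating-point error, so the split loses no information. In a dual-branch design of the kind the abstract describes, `high` would feed the structure-oriented branch and `low` the semantics-oriented branch.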

