CV · Mar 6

WMoE-CLIP: Wavelet-Enhanced Mixture-of-Experts Prompt Learning for Zero-Shot Anomaly Detection

arXiv:2603.06313v1 · 7.32 citations
Predicted impact top 66% in CV · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses the detection of unseen anomalies without task-specific supervision, which matters for applications such as industrial inspection and medical diagnosis. It appears incremental, since it builds on existing prompt learning and wavelet techniques.

The paper targets two limitations of existing vision-language approaches to zero-shot anomaly detection, fixed textual prompts and spatial-only features, and reports experiments on 14 industrial and medical datasets.

Vision-language models have recently shown strong generalization in zero-shot anomaly detection (ZSAD), enabling the detection of unseen anomalies without task-specific supervision. However, existing approaches typically rely on fixed textual prompts, which struggle to capture complex semantics, and focus solely on spatial-domain features, limiting their ability to detect subtle anomalies. To address these challenges, we propose a wavelet-enhanced mixture-of-experts prompt learning method for ZSAD. Specifically, a variational autoencoder is employed to model global semantic representations and integrate them into prompts to enhance adaptability to diverse anomaly patterns. Wavelet decomposition extracts multi-frequency image features that dynamically refine textual embeddings through cross-modal interactions. Furthermore, a semantic-aware mixture-of-experts module is introduced to aggregate contextual information. Extensive experiments on 14 industrial and medical datasets demonstrate the effectiveness of the proposed method.
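As a rough illustration of two ingredients named in the abstract, the sketch below splits an image into frequency sub-bands with a single-level Haar wavelet transform and then mixes per-band features with a softmax-gated mixture of experts. It is not the paper's implementation: the Haar basis, the linear experts, the function names (`haar_dwt2`, `moe_aggregate`), and all shapes are assumptions chosen for brevity, and the VAE and cross-modal prompt refinement are left out.

```python
import numpy as np


def haar_dwt2(image):
    """Single-level 2-D orthonormal Haar transform of an (H, W) array.

    Assumption: the Haar basis is used here only because it is the simplest
    wavelet; the abstract does not say which wavelet the paper uses.
    Returns four (H/2, W/2) sub-bands: one low-frequency approximation and
    three high-frequency detail bands.
    """
    x = np.asarray(image, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2-D array with even height and width")
    # The four pixels of every non-overlapping 2x2 block.
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    approx = (a + b + c + d) / 2.0       # local average (low frequency)
    col_detail = (a - b + c - d) / 2.0   # change between neighbouring columns
    row_detail = (a + b - c - d) / 2.0   # change between neighbouring rows
    diag_detail = (a - b - c + d) / 2.0  # diagonal change
    return approx, col_detail, row_detail, diag_detail


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def moe_aggregate(features, gate_weights, expert_weights):
    """Dense softmax-gated mixture of linear experts (hypothetical layout).

    features:       (N, D)        one feature vector per token or band
    gate_weights:   (D, E)        gating projection, E experts
    expert_weights: (E, D, D_out) one linear map per expert
    Returns the mixed output (N, D_out) and the gate values (N, E).
    """
    gates = softmax(features @ gate_weights)
    expert_out = np.einsum("nd,edo->neo", features, expert_weights)
    mixed = np.einsum("ne,neo->no", gates, expert_out)
    return mixed, gates


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.normal(size=(8, 8))
    bands = haar_dwt2(image)
    # Treat each flattened sub-band as one feature vector (illustrative only).
    features = np.stack([band.ravel() for band in bands])
    mixed, gates = moe_aggregate(
        features,
        rng.normal(size=(16, 3)),
        rng.normal(size=(3, 16, 5)),
    )
    print(mixed.shape, gates.shape)
```

The detail bands respond to edges and fine texture that a purely spatial average smooths away, which is the intuition behind using multi-frequency features for subtle anomalies. In the sketch the gate weights every expert for every input; a sparse top-k gate would be an equally plausible reading of "mixture-of-experts" here.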

