When Anomalies Depend on Context: Learning Conditional Compatibility for Anomaly Detection
This paper addresses the challenge of detecting visual anomalies whose abnormality depends on context rather than on intrinsic appearance, a setting that traditional anomaly detection formulations do not cover.
The paper tackles contextual anomaly detection in visual data, where anomaly labels depend on subject-context compatibility rather than intrinsic appearance. It introduces the CAAD-3K benchmark and a conditional compatibility learning framework that substantially outperforms existing approaches on CAAD-3K and achieves state-of-the-art performance on MVTec-AD and VisA.
Anomaly detection is often formulated under the assumption that abnormality is an intrinsic property of an observation, independent of context. This assumption breaks down in many real-world settings, where the same object or action may be normal or anomalous depending on latent contextual factors (e.g., running on a track versus on a highway). We revisit \emph{contextual anomaly detection}, classically defined as context-dependent abnormality, and operationalize it in the visual domain, where anomaly labels depend on subject--context compatibility rather than intrinsic appearance. To enable systematic study of this setting, we introduce CAAD-3K, a benchmark that isolates contextual anomalies by controlling subject identity while varying context. We further propose a conditional compatibility learning framework that leverages vision--language representations to model subject--context relationships under limited supervision. Our method substantially outperforms existing approaches on CAAD-3K and achieves state-of-the-art performance on MVTec-AD and VisA, demonstrating that modeling context dependence complements traditional structural anomaly detection. Our code and dataset will be publicly released.
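To make the idea of subject-context compatibility concrete, the following is a minimal illustrative sketch, not the framework proposed in the paper. It assumes that a subject and a context have each already been mapped to vectors in a shared embedding space (for example by some vision-language encoder) and scores an observation by how poorly the two agree. The function names, the cosine-based score, and the toy two-dimensional vectors are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contextual_anomaly_score(subject_emb, context_emb):
    """Hypothetical score in [0, 1]: 0 means the subject and context
    embeddings agree perfectly, 1 means they point in opposite directions.
    This is an assumed stand-in, not the paper's learned compatibility."""
    return (1.0 - cosine(subject_emb, context_emb)) / 2.0


# Toy embeddings (made up for illustration): the subject is held fixed
# while only the context changes, mirroring the running example above.
runner = [1.0, 0.0]
track = [0.9, 0.1]      # context assumed compatible with running
highway = [-0.8, 0.6]   # context assumed incompatible with running

score_on_track = contextual_anomaly_score(runner, track)
score_on_highway = contextual_anomaly_score(runner, highway)
print(score_on_track < score_on_highway)
```

In this sketch the subject embedding is identical in both cases, so any difference in score comes only from the context, which is the property a context-independent anomaly score cannot express.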