LG CV | Mar 3, 2025

Language-Assisted Feature Transformation for Anomaly Detection

EungGu Yun, Heonjin Ha, Yeongwoo Nam, Bryan Dongik Lee

arXiv:2503.01184v1 | 2 citations | h-index: 1 | ICLR

Originality: Incremental advance

AI Analysis

The paper addresses the problem of detecting user-specific anomalies in visual data. The contribution is incremental: it builds on existing anomaly detection methods by adding language guidance.

The paper tackles the challenge of incorporating user knowledge into anomaly detection. It proposes LAFT, a feature transformation method that uses vision-language models to align visual features with user-defined requirements, so that specific anomalies of interest can be detected. The authors report experiments on both toy and real-world datasets that validate the method.

This paper introduces Language-Assisted Feature Transformation (LAFT), a novel feature transformation method designed to incorporate user knowledge and preferences into anomaly detection using natural language. Accurately modeling the boundary of normality is crucial for distinguishing abnormal data, but this is often challenging due to limited data or the presence of nuisance attributes. While unsupervised methods that rely solely on data without user guidance are common, they may fail to detect anomalies of specific interest. To address this limitation, LAFT leverages the shared image-text embedding space of vision-language models to transform visual features according to user-defined requirements. Combined with anomaly detection methods, LAFT effectively aligns visual features with user preferences, allowing anomalies of interest to be detected. Extensive experiments on both toy and real-world datasets validate the effectiveness of the method.
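The abstract does not spell out the transformation itself, so the following is only a minimal sketch of one plausible way to use a shared image-text embedding space for this purpose. It assumes that text prompts describing the attribute of interest (for example, prompts that differ only in that attribute) span a "concept subspace", and that image features can either be projected onto that subspace or have it removed before a standard k-nearest-neighbour anomaly score is computed. The function names `concept_subspace`, `guide`, `ignore`, and `knn_score` are hypothetical and are not taken from the paper or any library; the embeddings are assumed to come from a vision-language model's image and text encoders.

```python
import numpy as np


def concept_subspace(text_embs: np.ndarray, n_components: int = 1) -> np.ndarray:
    """Return an orthonormal basis of shape (n_components, d).

    ASSUMPTION: the attribute of interest is captured by the directions
    along which the prompt embeddings differ. We take all pairwise
    differences and keep the leading right-singular vectors.
    """
    i, j = np.triu_indices(len(text_embs), k=1)
    diffs = text_embs[i] - text_embs[j]
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:n_components]


def guide(features: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Keep only the components of the features inside the concept subspace."""
    return features @ basis.T


def ignore(features: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the concept subspace, e.g. to suppress a nuisance attribute."""
    return features - (features @ basis.T) @ basis


def knn_score(train: np.ndarray, test: np.ndarray, k: int = 5) -> np.ndarray:
    """Anomaly score: mean Euclidean distance to the k nearest training features."""
    dists = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=-1)
    return np.sort(dists, axis=1)[:, :k].mean(axis=1)
```

In use, one would embed the normal training images and the test images, build the basis from a handful of prompts, and then call `knn_score(guide(train, basis), guide(test, basis))` so that only deviations along the user-specified attribute raise the score. Using `ignore` instead gives the complementary behaviour, where variation along the described attribute is treated as a nuisance and discarded.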
