CV AI, Aug 27, 2024

Diffusion based Semantic Outlier Generation via Nuisance Awareness for Out-of-Distribution Detection

Suhee Yoon, Sanghyu Yoon, Ye Seul Sim, Sungik Choi, Kyungeun Lee, Hye-Seung Cho, Hankook Lee, Woohyung Lim

arXiv:2408.14841v23.73 citations, h-index: 3

Originality: Incremental advance

AI Analysis

This work aims to improve OOD detection by generating harder synthetic outliers that stay close to the in-distribution data. It is an incremental advance because it builds on existing methods that train detectors with synthetic OOD data.

The paper tackles the problem of generating challenging synthetic outliers for out-of-distribution (OOD) detection. It proposes SONA, a framework that uses diffusion models to create outliers that differ semantically from in-distribution samples while resembling them in nuisance content. The authors report an AUROC of 88% on near-OOD datasets, approximately 6% above baseline methods.

Out-of-distribution (OOD) detection, which determines whether a given sample is part of the in-distribution (ID), has recently shown promising results through training with synthetic OOD datasets. Nonetheless, existing methods often produce outliers that are considerably distant from the ID, showing limited efficacy for capturing subtle distinctions between ID and OOD. To address these issues, we propose a novel framework, Semantic Outlier generation via Nuisance Awareness (SONA), which notably produces challenging outliers by directly leveraging pixel-space ID samples through diffusion models. Our approach incorporates SONA guidance, providing separate control over semantic and nuisance regions of ID samples. Thereby, the generated outliers achieve two crucial properties: (i) they present explicit semantic-discrepant information, while (ii) maintaining various levels of nuisance resemblance with ID. Furthermore, the improved OOD detector training with SONA outliers facilitates learning with a focus on semantic distinctions. Extensive experiments demonstrate the effectiveness of our framework, achieving an impressive AUROC of 88% on near-OOD datasets, which surpasses the performance of baseline methods by a significant margin of approximately 6%.
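
For readers unfamiliar with the metric quoted in the abstract: AUROC is the probability that a detector assigns a higher score to a randomly chosen ID sample than to a randomly chosen OOD sample. The following is a minimal sketch of that computation, not the paper's evaluation code. It assumes a hypothetical detector whose scores are higher for ID inputs, and the score values are made up for illustration.

```python
def auroc(id_scores, ood_scores):
    """Fraction of (ID, OOD) pairs where the ID sample scores higher.

    Ties count as half a win. Assumes higher score means "more in-distribution".
    """
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Illustrative scores only (not results from the paper).
id_scores = [0.9, 0.8, 0.7, 0.6]
ood_scores = [0.65, 0.4, 0.3, 0.2]  # 0.65 mimics a near-OOD sample close to ID

print(auroc(id_scores, ood_scores))  # 0.9375
```

Only one of the 16 pairs is ranked incorrectly (the ID score 0.6 falls below the OOD score 0.65), which is the kind of near-OOD confusion the abstract describes. The pairwise loop is quadratic in the number of samples; a rank-based implementation would be preferable for large test sets.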

