CV | Sep 26, 2023
Nearest Neighbor Guidance for Out-of-Distribution Detection
Jaewoo Park, Yoon Gyo Jung, Andrew Beng Jin Teoh
Detecting out-of-distribution (OOD) samples is crucial for machine learning models deployed in open-world environments. Classifier-based scores are a standard approach for OOD detection due to their fine-grained detection capability. However, these scores often suffer from overconfidence, misclassifying OOD samples that lie far from the in-distribution region. To address this challenge, we propose Nearest Neighbor Guidance (NNGuide), a method that guides the classifier-based score to respect the boundary geometry of the data manifold. NNGuide reduces the overconfidence on OOD samples while preserving the fine-grained capability of the classifier-based score. We conduct extensive experiments on ImageNet OOD detection benchmarks under diverse settings, including a scenario where the ID data undergoes natural distribution shift. Our results demonstrate that NNGuide provides a significant performance improvement over the base detection scores, achieving state-of-the-art results on the AUROC, FPR95, and AUPR metrics. The code is available at https://github.com/roomo7time/nnguide.
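As a rough illustration of the idea of guiding a classifier-based score with nearest neighbors, the sketch below multiplies an energy-style score by the mean cosine similarity to the k nearest in-distribution features. The multiplicative form, the energy score, the cosine similarity, and all function names are assumptions made for this example; the abstract does not specify them, and the linked repository remains the reference for the actual method.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so that dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def energy_score(logits):
    # Numerically stable log-sum-exp of the logits (one common classifier-based score).
    m = logits.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=-1, keepdims=True))).squeeze(-1)

def knn_guidance(query_feats, bank_feats, k=5):
    # Mean cosine similarity between each query and its k nearest bank features.
    # Assumes k <= number of bank features.
    sims = l2_normalize(query_feats) @ l2_normalize(bank_feats).T
    topk = np.sort(sims, axis=1)[:, -k:]
    return topk.mean(axis=1)

def guided_score(logits, query_feats, bank_feats, k=5):
    # Assumed combination rule: base score scaled by the neighbor-based guidance term.
    return energy_score(logits) * knn_guidance(query_feats, bank_feats, k)
```

With identical logits, a query whose features sit far from the in-distribution bank receives a lower guided score than one that sits inside it, which is the qualitative behavior the abstract describes.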
CV | May 26
Memory-Distilled Selection for Noise-Robust Anomaly Detection
Sirojbek Safarov, Jaewoo Park, Yoon Gyo Jung et al.
Anomaly detection (AD) under data contamination is critical for deploying unsupervised defect detection in industrial environments, where curating perfectly clean training sets is impractical. However, existing methods are sensitive to contamination, suffering significant performance degradation as the noise ratio increases. In this paper, we propose Memory-Distilled Selection (MeDS), a training algorithm based on data selection. MeDS constructs an ensemble of partial memories via random subsampling, where the resulting sparsity acts as a low-pass filter that captures nominal patterns across a wide range of noise ratios, enabling coarse-level identification of contaminated samples. The aggregated distances to the bootstrapped memories are then distilled into a reconstruction score network, which is subsequently fine-tuned on clean data filtered using scores from the distilled model, enabling fine-grained localization of anomalies. MeDS is robust across a wide range of noise ratios without requiring noise-ratio-specific hyperparameter tuning, achieving 99.16% image-level AUROC on MVTecAD at a 40% noise ratio, and attaining state-of-the-art performance on both VisA and Real-IAD under noisy settings. We thoroughly verify the efficacy of MeDS on industrial AD benchmarks under noisy data scenarios, accompanied by in-depth empirical analyses.
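The first stage described above (an ensemble of randomly subsampled partial memories whose distances are aggregated into a coarse contamination score) can be sketched as follows. This is only a toy version under stated assumptions: Euclidean nearest-neighbor distance, mean aggregation, exclusion of a sample from memories that contain it, and the function name and parameters are all hypothetical choices, not details from the abstract. The distillation and fine-tuning stages are not covered.

```python
import numpy as np

def partial_memory_scores(feats, n_memories=20, subsample_ratio=0.1, seed=0):
    # feats: (n, d) array of per-sample features from a (possibly contaminated) training set.
    rng = np.random.default_rng(seed)
    n = feats.shape[0]
    m = max(1, int(round(subsample_ratio * n)))
    total = np.zeros(n)
    counts = np.zeros(n)
    for _ in range(n_memories):
        # Build one sparse memory by random subsampling without replacement.
        idx = rng.choice(n, size=m, replace=False)
        memory = feats[idx]
        # Distance from every sample to its nearest memory entry.
        dists = np.linalg.norm(feats[:, None, :] - memory[None, :, :], axis=-1)
        nearest = dists.min(axis=1)
        # Skip samples that are inside this memory (their distance is trivially zero).
        outside = np.ones(n, dtype=bool)
        outside[idx] = False
        total[outside] += nearest[outside]
        counts[outside] += 1
    # Assumed aggregation: mean distance over the memories that did not contain the sample.
    return total / np.maximum(counts, 1)
```

Samples with high aggregated scores are candidates for contamination, since rare patterns are unlikely to be well covered by a sparse random memory.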
CV | Jul 2, 2024
Face Reconstruction Transfer Attack as Out-of-Distribution Generalization
Yoon Gyo Jung, Jaewoo Park, Xingbo Dong et al.
Understanding the vulnerability of face recognition systems to malicious attacks is of critical importance. Previous works have focused on reconstructing face images that can penetrate a targeted verification system. Even in the white-box scenario, however, naively reconstructed images misrepresent the identity information, hence the attacks are easily neutralized once the face system is updated or changed. In this paper, we aim to reconstruct face images that can transfer face attacks to unseen encoders. We term this problem Face Reconstruction Transfer Attack (FRTA) and show that it can be formulated as an out-of-distribution (OOD) generalization problem. Inspired by its OOD nature, we propose to solve FRTA by Averaged Latent Search and Unsupervised Validation with pseudo target (ALSUV). To strengthen the reconstruction attack on OOD unseen encoders, ALSUV reconstructs the face by searching the latent space of the amortized generator StyleGAN2 through multiple latent optimization, latent optimization trajectory averaging, and unsupervised validation with a pseudo target. We demonstrate the efficacy and generalization of our method on widely used face datasets, accompanying it with extensive ablation studies and visual, qualitative, and quantitative analyses. The source code will be released.
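The three ingredients named above (several latent optimizations, averaging along each optimization trajectory, and a validation step that picks among candidates) can be illustrated with a toy search loop. Everything here is a stand-in: `loss_grad` and `validate` are hypothetical callables supplied by the user, plain gradient descent replaces whatever optimizer the authors use, and no generator or face encoder is involved.

```python
import numpy as np

def averaged_latent_search(loss_grad, validate, dim, n_starts=4,
                           n_steps=100, lr=0.1, avg_from=50, seed=0):
    # loss_grad(z): gradient of a reconstruction loss w.r.t. latent z (hypothetical callable).
    # validate(z): lower-is-better validation score for a candidate latent (hypothetical callable).
    rng = np.random.default_rng(seed)
    candidates = []
    for _ in range(n_starts):
        z = rng.normal(size=dim)          # independent random initialization
        tail = []
        for t in range(n_steps):
            z = z - lr * loss_grad(z)     # plain gradient descent step
            if t >= avg_from:
                tail.append(z.copy())     # keep the late part of the trajectory
        # Trajectory averaging: use the mean of late iterates rather than the last one.
        candidates.append(np.mean(tail, axis=0))
    # Validation: return the candidate with the best validation score.
    scores = [validate(c) for c in candidates]
    return candidates[int(np.argmin(scores))]
```

In the setting of the abstract, the validation score would be computed against a pseudo target without access to the unseen encoder; how that target is built is not stated here and is left abstract in the sketch.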
CV | Apr 3, 2025
TailedCore: Few-Shot Sampling for Unsupervised Long-Tail Noisy Anomaly Detection
Yoon Gyo Jung, Jaewoo Park, Jaeho Yoon et al.
We aim to solve unsupervised anomaly detection in a practical, challenging environment where the normal dataset is contaminated with defective regions and its product class distribution is tailed but unknown. We observe that existing models suffer from a tail-versus-noise trade-off: if a model is robust against pixel noise, its performance deteriorates on tail class samples, and vice versa. To mitigate the issue, we handle the tail class and noise samples independently. To this end, we propose TailSampler, a novel class size predictor that estimates the class cardinality of samples based on a symmetric assumption on the class-wise distribution of embedding similarities. TailSampler can be used to sample the tail class samples exclusively, allowing them to be handled separately. Building on these components, we construct a memory-based anomaly detection model, TailedCore, whose memory both captures tail class information well and is noise-robust. We extensively validate the effectiveness of TailedCore in the unsupervised long-tail noisy anomaly detection setting, and show that TailedCore outperforms the state of the art in most settings.
CV | Dec 12, 2020
Periocular Embedding Learning with Consistent Knowledge Distillation from Face
Yoon Gyo Jung, Jaewoo Park, Cheng Yaw Low et al.
The periocular biometric, the peripheral area of the eye, is a collaborative alternative to the face, especially when the face is occluded or masked. However, in practice, the periocular biometric alone captures the least salient facial features, thereby lacking discriminative information, particularly in wild environments. To address these problems, we transfer discriminative information from the face to support the training of a periocular network by using knowledge distillation. Specifically, we leverage face images for periocular embedding learning, but the periocular region alone is used for identification or verification. To effectively enhance periocular embeddings with the face, we propose Consistent Knowledge Distillation (CKD), which imposes consistency between face and periocular networks across prediction and feature layers. We find that imposing consistency at the prediction layer enables (1) extraction of global discriminative relationship information from face images and (2) effective transfer of the information from the face network to the periocular network. In particular, consistency regularizes the prediction units to extract and store profound inter-class relationship information of face images. (3) The feature layer consistency, on the other hand, makes the periocular features robust against identity-irrelevant attributes. Overall, CKD empowers the sole periocular network to produce robust discriminative embeddings for periocular recognition in the wild. We theoretically and empirically validate the core principles of the distillation mechanism in CKD, discovering that CKD is equivalent to label smoothing with a novel sparsity-oriented regularizer that helps the network prediction to capture the global discriminative relationship. Extensive experiments reveal that CKD achieves state-of-the-art results on standard periocular recognition benchmark datasets.
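To make the notion of consistency at both the prediction and feature layers concrete, here is a minimal sketch of a two-term consistency loss between a face network and a periocular network. The specific choices (KL divergence between softmax predictions, squared Euclidean distance between features, and the weights `w_pred` and `w_feat`) are assumptions for illustration; the abstract does not state which divergences CKD uses.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def prediction_consistency(face_logits, peri_logits, eps=1e-12):
    # Assumed form: mean KL(face prediction || periocular prediction) over the batch.
    p = softmax(face_logits)
    q = softmax(peri_logits)
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def feature_consistency(face_feats, peri_feats):
    # Assumed form: mean squared Euclidean distance between paired features.
    return float(np.mean(np.sum((face_feats - peri_feats) ** 2, axis=-1)))

def consistency_loss(face_logits, peri_logits, face_feats, peri_feats,
                     w_pred=1.0, w_feat=1.0):
    # Weighted sum of the two consistency terms (weights are hypothetical).
    return (w_pred * prediction_consistency(face_logits, peri_logits)
            + w_feat * feature_consistency(face_feats, peri_feats))
```

In training, a term like this would be added to the usual identity classification loss; at test time only the periocular network is used, as the abstract notes.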
LG | Mar 3, 2020
Discriminative Multi-level Reconstruction under Compact Latent Space for One-Class Novelty Detection
Jaewoo Park, Yoon Gyo Jung, Andrew Beng Jin Teoh
In one-class novelty detection, a model learns solely from the in-class data to single out out-class instances. Autoencoder (AE) variants aim to compactly model the in-class data to reconstruct it exclusively, thus differentiating in-class from out-class instances by the reconstruction error. However, compact modeling done improperly might collapse the latent representations of the in-class data and thus their reconstruction, which would lead to performance deterioration. Moreover, to properly measure the reconstruction error of high-dimensional data, a metric is required that captures high-level semantics of the data. To this end, we propose Discriminative Compact AE (DCAE) that learns both compact and collapse-free latent representations of the in-class data, thereby reconstructing them both finely and exclusively. In DCAE, (a) we force a compact latent space to bijectively represent the in-class data by reconstructing them through internal discriminative layers of generative adversarial nets. (b) Based on the deep encoder's vulnerability to open set risk, out-class instances are encoded into the same compact latent space and reconstructed poorly without sacrificing the quality of in-class data reconstruction. (c) In inference, the reconstruction error is measured by a novel metric that computes the dissimilarity between a query and its reconstruction based on the class semantics captured by the internal discriminator. Extensive experiments on public image datasets validate the effectiveness of our proposed model on both novelty and adversarial example detection, delivering state-of-the-art performance.
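Point (c) above, measuring reconstruction error in a learned feature space instead of pixel space, can be sketched generically. The feature extractor below is a fixed random projection with a tanh nonlinearity that merely stands in for internal discriminator layers; it, the L2 distance, and the function names are assumptions for illustration and do not reproduce the DCAE metric.

```python
import numpy as np

def make_toy_feature_fn(in_dim, out_dim=16, seed=0):
    # Hypothetical stand-in for internal discriminator features:
    # a fixed random linear map followed by tanh.
    w = np.random.default_rng(seed).normal(size=(in_dim, out_dim))
    return lambda x: np.tanh(x @ w)

def feature_space_error(x, x_hat, feature_fn):
    # Novelty score per sample: L2 distance between the features of a query
    # and the features of its reconstruction (larger means more novel).
    diff = feature_fn(x) - feature_fn(x_hat)
    return np.linalg.norm(diff.reshape(len(x), -1), axis=1)
```

A perfectly reconstructed input scores zero, while a poorly reconstructed one scores higher; thresholding this score yields the in-class versus out-class decision.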