CVJul 20, 2023
Proxy Anchor-based Unsupervised Learning for Continuous Generalized Category DiscoveryHyungmin Kim, Sungho Suh, Daehwan Kim et al.
Recent advances in deep learning have significantly improved the performance of various computer vision applications. However, discovering novel categories in an incremental learning scenario remains a challenging problem due to the lack of prior knowledge about the number and nature of new categories. Existing methods for novel category discovery are limited by their reliance on labeled datasets and prior knowledge about the number of novel categories and the proportion of novel samples in the batch. To address the limitations and more accurately reflect real-world scenarios, in this paper, we propose a novel unsupervised class incremental learning approach for discovering novel categories on unlabeled sets without prior knowledge. The proposed method fine-tunes the feature extractor and proxy anchors on labeled sets, then splits samples into old and novel categories and clusters on the unlabeled dataset. Furthermore, the proxy anchors-based exemplar generates representative category vectors to mitigate catastrophic forgetting. Experimental results demonstrate that our proposed approach outperforms the state-of-the-art methods on fine-grained datasets under real-world scenarios.
CVNov 20, 2022
AI-KD: Adversarial learning and Implicit regularization for self-Knowledge DistillationHyungmin Kim, Sungho Suh, Sunghyun Baek et al.
We present a novel adversarial penalized self-knowledge distillation method, named adversarial learning and implicit regularization for self-knowledge distillation (AI-KD), which regularizes the training procedure by adversarial learning and implicit distillations. Our model not only distills the deterministic and progressive knowledge which are from the pre-trained and previous epoch predictive probabilities but also transfers the knowledge of the deterministic predictive distributions using adversarial learning. The motivation is that the self-knowledge distillation methods regularize the predictive probabilities with soft targets, but the exact distributions may be hard to predict. Our method deploys a discriminator to distinguish the distributions between the pre-trained and student models while the student model is trained to fool the discriminator in the trained procedure. Thus, the student model not only can learn the pre-trained model's predictive probabilities but also align the distributions between the pre-trained and student models. We demonstrate the effectiveness of the proposed method with network architectures on multiple datasets and show the proposed method achieves better performance than state-of-the-art methods.
CLApr 2, 2024
HyperCLOVA X Technical ReportKang Min Yoo, Jaegeun Han, Sookyo In et al.
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.
CVApr 7, 2025
S^4M: Boosting Semi-Supervised Instance Segmentation with SAMHeeji Yoon, Heeseong Shin, Eunbeen Hong et al.
Semi-supervised instance segmentation poses challenges due to limited labeled data, causing difficulties in accurately localizing distinct object instances. Current teacher-student frameworks still suffer from performance constraints due to unreliable pseudo-label quality stemming from limited labeled data. While the Segment Anything Model (SAM) offers robust segmentation capabilities at various granularities, directly applying SAM to this task introduces challenges such as class-agnostic predictions and potential over-segmentation. To address these complexities, we carefully integrate SAM into the semi-supervised instance segmentation framework, developing a novel distillation method that effectively captures the precise localization capabilities of SAM without compromising semantic recognition. Furthermore, we incorporate pseudo-label refinement as well as a specialized data augmentation with the refined pseudo-labels, resulting in superior performance. We establish state-of-the-art performance, and provide comprehensive experiments and ablation studies to validate the effectiveness of our proposed approach.
CVNov 5, 2024
SPACE: SPAtial-aware Consistency rEgularization for anomaly detection in Industrial applicationsDaehwan Kim, Hyungmin Kim, Daun Jeong et al.
In this paper, we propose SPACE, a novel anomaly detection methodology that integrates a Feature Encoder (FE) into the structure of the Student-Teacher method. The proposed method has two key elements: Spatial Consistency regularization Loss (SCL) and Feature converter Module (FM). SCL prevents overfitting in student models by avoiding excessive imitation of the teacher model. Simultaneously, it facilitates the expansion of normal data features by steering clear of abnormal areas generated through data augmentation. This dual functionality ensures a robust boundary between normal and abnormal data. The FM prevents the learning of ambiguous information from the FE. This protects the learned features and enables more effective detection of structural and logical anomalies. Through these elements, SPACE is available to minimize the influence of the FE while integrating various data augmentations.In this study, we evaluated the proposed method on the MVTec LOCO, MVTec AD, and VisA datasets. Experimental results, through qualitative evaluation, demonstrate the superiority of detection and efficiency of each module compared to state-of-the-art methods.