Sahng-Min Yoo

CV
h-index: 13
Papers: 5
Citations: 66
Novelty: 43%
AI Score: 30

5 Papers

CV | Oct 19, 2020 | Code
Continual Unsupervised Domain Adaptation for Semantic Segmentation

Joonhyuk Kim, Sahng-Min Yoo, Gyeong-Moon Park et al.

Unsupervised Domain Adaptation (UDA) for semantic segmentation has been favorably applied to real-world scenarios in which pixel-level labels are hard to obtain. Most existing UDA methods assume that all target data are introduced simultaneously. Yet, the data are usually presented sequentially in the real world. Moreover, Continual UDA, which deals with more practical scenarios with multiple target domains in the continual learning setting, has not been actively explored. In this light, we propose Continual UDA for semantic segmentation based on a newly designed Expanding Target-specific Memory (ETM) framework. Our novel ETM framework contains Target-specific Memory (TM) for each target domain to alleviate catastrophic forgetting. Furthermore, a proposed Double Hinge Adversarial (DHA) loss leads the network to produce better UDA performance overall. Our design of the TM and training objectives lets the semantic segmentation network adapt to the current target domain while preserving the knowledge learned on previous target domains. The model with the proposed framework outperforms other state-of-the-art models in continual learning settings on standard benchmarks such as GTA5, SYNTHIA, CityScapes, IDD, and Cross-City datasets. The source code is available at https://github.com/joonh-kim/ETM.

CV | Nov 1, 2024
Towards High-fidelity Head Blending with Chroma Keying for Industrial Applications

Hah Min Lew, Sahng-Min Yoo, Hyunwoo Kang et al.

We introduce an industrial Head Blending pipeline for the task of seamlessly integrating an actor's head onto a target body in digital content creation. The key challenge stems from discrepancies in head shape and hair structure, which lead to unnatural boundaries and blending artifacts. Existing methods treat foreground and background as a single task, resulting in suboptimal blending quality. To address this problem, we propose CHANGER, a novel pipeline that decouples background integration from foreground blending. By utilizing chroma keying for artifact-free background generation and introducing Head shape and long Hair augmentation ($H^2$ augmentation) to simulate a wide range of head shapes and hair styles, CHANGER improves generalization across a wide variety of real-world cases. Furthermore, our Foreground Predictive Attention Transformer (FPAT) module enhances foreground blending by predicting and focusing on key head and body regions. Quantitative and qualitative evaluations on benchmark datasets demonstrate that CHANGER outperforms state-of-the-art methods, delivering high-fidelity, industrial-grade results.

HC | Aug 20, 2021
Type Anywhere You Want: An Introduction to Invisible Mobile Keyboard

Sahng-Min Yoo, Ue-Hwan Kim, Yewon Hwang et al.

Contemporary soft keyboards have two limitations: the lack of physical feedback increases typos, and the keyboard interface reduces the usable area of the screen. To overcome these limitations, we propose an Invisible Mobile Keyboard (IMK), which lets users freely type on the desired area without any constraints. To facilitate a data-driven IMK decoding task, we have collected the most extensive text-entry dataset (approximately 2M pairs of typing positions and the corresponding characters). Additionally, we propose our baseline decoder along with a semantic typo correction mechanism based on self-attention, which decodes such unconstrained inputs with high accuracy (96.0%). Moreover, the user study reveals that users could type faster and found IMK with our decoder convenient and satisfying. Lastly, we make the source code and the dataset public to contribute to the research community.

CV | Mar 9, 2021
ChangeSim: Towards End-to-End Online Scene Change Detection in Industrial Indoor Environments

Jin-Man Park, Jae-Hyuk Jang, Sahng-Min Yoo et al.

We present a challenging dataset, ChangeSim, aimed at online scene change detection (SCD) and more. The data is collected in photo-realistic simulation environments in the presence of non-targeted environmental variations, such as air turbidity and light condition changes, as well as targeted object changes in industrial indoor environments. Collecting data in simulation makes multi-modal sensor data and precise ground-truth labels obtainable, including RGB images, depth images, semantic segmentation, change segmentation, camera poses, and 3D reconstructions. While the previous online SCD datasets evaluate models given well-aligned image pairs, ChangeSim also provides raw unpaired sequences that present an opportunity to develop an online SCD model in an end-to-end manner, considering both pairing and detection. Experiments show that even the latest pair-based SCD models suffer from the bottleneck of the pairing process, and that this bottleneck worsens when the environment contains non-targeted variations. Our dataset is available at http://sammica.github.io/ChangeSim/.

HC | Jul 31, 2019
I-Keyboard: Fully Imaginary Keyboard on Touch Devices Empowered by Deep Neural Decoder

Ue-Hwan Kim, Sahng-Min Yoo, Jong-Hwan Kim

Text-entry aims to provide an effective and efficient pathway for humans to deliver their messages to computers. With the advent of mobile computing, the recent focus of text-entry research has moved from physical keyboards to soft keyboards. Current soft keyboards, however, increase the typo rate due to the lack of tactile feedback and degrade the usability of mobile devices because they occupy a large portion of the screen. To tackle these limitations, we propose a fully imaginary keyboard (I-Keyboard) with a deep neural decoder (DND). The invisibility of I-Keyboard maximizes the usability of mobile devices, and DND, empowered by a deep neural architecture, allows users to start typing from any position on the touch screen at any angle. To the best of our knowledge, the eyes-free ten-finger typing scenario of I-Keyboard, which requires neither a calibration step nor a predefined typing region, is first explored in this work. To train DND, we collected the largest user dataset in the process of developing I-Keyboard. We verified the performance of the proposed I-Keyboard and DND by conducting a series of comprehensive simulations and experiments under various conditions. I-Keyboard showed 18.95% and 4.06% increases in typing speed (45.57 WPM) and accuracy (95.84%), respectively, over the baseline.