Gyojung Gu

h-index4

4papers

455citations

Novelty41%

AI Score34

Ranked #109,939 of 194,257 authors (top 57%)#36,733 in CV (top 62%)

4 Papers

31.3CVJun 28, 2022Code

High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions

Sangyun Lee, Gyojung Gu, Sunghyun Park et al.

Image-based virtual try-on aims to synthesize an image of a person wearing a given clothing item. To solve the task, the existing methods warp the clothing item to fit the person's body and generate the segmentation map of the person wearing the item before fusing the item with the person. However, when the warping and the segmentation generation stages operate individually without information exchange, the misalignment between the warped clothes and the segmentation map occurs, which leads to the artifacts in the final image. The information disconnection also causes excessive warping near the clothing regions occluded by the body parts, so-called pixel-squeezing artifacts. To settle the issues, we propose a novel try-on condition generator as a unified module of the two stages (i.e., warping and segmentation generation stages). A newly proposed feature fusion block in the condition generator implements the information exchange, and the condition generator does not create any misalignment or pixel-squeezing artifacts. We also introduce discriminator rejection that filters out the incorrect segmentation map predictions and assures the performance of virtual try-on frameworks. Experiments on a high-resolution dataset demonstrate that our model successfully handles the misalignment and occlusion, and significantly outperforms the baselines. Code is available at https://github.com/sangyun884/HR-VITON.

8.1CVJun 17, 2022

HairFIT: Pose-Invariant Hairstyle Transfer via Flow-based Hair Alignment and Semantic-Region-Aware Inpainting

Chaeyeon Chung, Taewoo Kim, Hyelin Nam et al.

Hairstyle transfer is the task of modifying a source hairstyle to a target one. Although recent hairstyle transfer models can reflect the delicate features of hairstyles, they still have two major limitations. First, the existing methods fail to transfer hairstyles when a source and a target image have different poses (e.g., viewing direction or face size), which is prevalent in the real world. Also, the previous models generate unrealistic images when there is a non-trivial amount of regions in the source image occluded by its original hair. When modifying long hair to short hair, shoulders or backgrounds occluded by the long hair need to be inpainted. To address these issues, we propose a novel framework for pose-invariant hairstyle transfer, HairFIT. Our model consists of two stages: 1) flow-based hair alignment and 2) hair synthesis. In the hair alignment stage, we leverage a keypoint-based optical flow estimator to align a target hairstyle with a source pose. Then, we generate a final hairstyle-transferred image in the hair synthesis stage based on Semantic-region-aware Inpainting Mask (SIM) estimator. Our SIM estimator divides the occluded regions in the source image into different semantic regions to reflect their distinct features during the inpainting. To demonstrate the effectiveness of our model, we conduct quantitative and qualitative evaluations using multi-view datasets, K-hairstyle and VoxCeleb. The results indicate that HairFIT achieves a state-of-the-art performance by successfully transferring hairstyles between images of different poses, which has never been achieved before.

35.0CVDec 4, 2023Code

StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On

Jeongho Kim, Gyojung Gu, Minho Park et al.

Given a clothing image and a person image, an image-based virtual try-on aims to generate a customized image that appears natural and accurately reflects the characteristics of the clothing image. In this work, we aim to expand the applicability of the pre-trained diffusion model so that it can be utilized independently for the virtual try-on task.The main challenge is to preserve the clothing details while effectively utilizing the robust generative capability of the pre-trained model. In order to tackle these issues, we propose StableVITON, learning the semantic correspondence between the clothing and the human body within the latent space of the pre-trained diffusion model in an end-to-end manner. Our proposed zero cross-attention blocks not only preserve the clothing details by learning the semantic correspondence but also generate high-fidelity images by utilizing the inherent knowledge of the pre-trained model in the warping process. Through our proposed novel attention total variation loss and applying augmentation, we achieve the sharp attention map, resulting in a more precise representation of clothing details. StableVITON outperforms the baselines in qualitative and quantitative evaluation, showing promising quality in arbitrary person images. Our code is available at https://github.com/rlawjdghek/StableVITON.

13.5CVFeb 11, 2021

K-Hairstyle: A Large-scale Korean Hairstyle Dataset for Virtual Hair Editing and Hairstyle Classification

Taewoo Kim, Chaeyeon Chung, Sunghyun Park et al.

The hair and beauty industry is a fast-growing industry. This led to the development of various applications, such as virtual hair dyeing or hairstyle transfer, to satisfy the customer's needs. Although several hairstyle datasets are available for these applications, they often consist of a relatively small number of images with low resolution, thus limiting their performance on high-quality hair editing. In response, we introduce a novel large-scale Korean hairstyle dataset, K-hairstyle, containing 500,000 high-resolution images. In addition, K-hairstyle includes various hair attributes annotated by Korean expert hairstylists as well as hair segmentation masks. We validate the effectiveness of our dataset via several applications, such as hair dyeing, hairstyle transfer, and hairstyle classification. K-hairstyle is publicly available at https://psh01087.github.io/K-Hairstyle/.