CV | LG | Feb 5, 2024

PFDM: Parser-Free Virtual Try-on via Diffusion Model

arXiv:2402.03047v1 | 3 citations | h-index: 8 | ICASSP
Originality: Incremental advance
AI Analysis

This paper addresses a bottleneck in virtual try-on for garment shopping by removing the need for accurate parsers, though the contribution appears incremental because it builds on existing diffusion models.

The paper proposes a parser-free diffusion model (PFDM) for virtual try-on that "wears" a garment on a target person using only two images, the person and the garment, without relying on segmentation masks. The authors report that it outperforms state-of-the-art parser-free and parser-based methods in synthesizing high-fidelity images.

Virtual try-on can significantly improve the garment shopping experiences in both online and in-store scenarios, attracting broad interest in computer vision. However, to achieve high-fidelity try-on performance, most state-of-the-art methods still rely on accurate segmentation masks, which are often produced by near-perfect parsers or manual labeling. To overcome the bottleneck, we propose a parser-free virtual try-on method based on the diffusion model (PFDM). Given two images, PFDM can "wear" garments on the target person seamlessly by implicitly warping without any other information. To learn the model effectively, we synthesize many pseudo-images and construct sample pairs by wearing various garments on persons. Supervised by the large-scale expanded dataset, we fuse the person and garment features using a proposed Garment Fusion Attention (GFA) mechanism. Experiments demonstrate that our proposed PFDM can successfully handle complex cases, synthesize high-fidelity images, and outperform both state-of-the-art parser-free and parser-based models.
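
The abstract says that person and garment features are fused by a Garment Fusion Attention (GFA) mechanism but does not describe its internals. As a rough illustration of what attention-based feature fusion can look like, the sketch below uses plain single-head cross-attention in NumPy: person tokens act as queries and garment tokens as keys and values. This is an assumption made for illustration, not the paper's GFA; the function name `cross_attention_fuse`, the projection matrices, the residual connection, and all shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention_fuse(person_feats, garment_feats, w_q, w_k, w_v):
    """Generic cross-attention fusion (illustrative, NOT the paper's GFA).

    person_feats:  (n_person, d) array of person tokens, used as queries.
    garment_feats: (n_garment, d) array of garment tokens, used as keys/values.
    w_q, w_k, w_v: (d, d) projection matrices (hypothetical parameters).
    Returns the fused person features and the attention weights.
    """
    q = person_feats @ w_q
    k = garment_feats @ w_k
    v = garment_feats @ w_v
    # Scaled dot-product scores between every person and garment token.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    # Residual connection: person features plus attended garment content.
    fused = person_feats + weights @ v
    return fused, weights


# Toy inputs: 16 person tokens and 12 garment tokens, each of dimension 8.
rng = np.random.default_rng(0)
d = 8
person = rng.normal(size=(16, d))
garment = rng.normal(size=(12, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

fused, weights = cross_attention_fuse(person, garment, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each row of `weights` is a distribution over garment tokens, so every person location draws garment content from wherever it attends most. This soft, learned correspondence is one way a model could align a garment to a body without an explicit mask, which is the kind of implicit warping the abstract refers to.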

