CV | Jul 17, 2025

FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization

arXiv:2507.13311v1 | 1 citation | h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses the need for realistic, controllable garment visualization in fashion e-commerce and offers a practical approach to personalized virtual fashion display.

The paper tackles the problem of generating personalized fashion images with diverse poses and lighting conditions from text descriptions, introducing FashionPose as a unified framework that achieves fine-grained pose synthesis and consistent relighting.

Realistic and controllable garment visualization is critical for fashion e-commerce, where users expect personalized previews under diverse poses and lighting conditions. Existing methods often rely on predefined poses, limiting semantic flexibility and illumination adaptability. To address this, we introduce FashionPose, the first unified text-to-pose-to-relighting generation framework. Given a natural language description, our method first predicts a 2D human pose, then employs a diffusion model to generate high-fidelity person images, and finally applies a lightweight relighting module, all guided by the same textual input. By replacing explicit pose annotations with text-driven conditioning, FashionPose enables accurate pose alignment, faithful garment rendering, and flexible lighting control. Experiments demonstrate fine-grained pose synthesis and efficient, consistent relighting, providing a practical solution for personalized virtual fashion display.
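The three-stage flow described in the abstract (text to 2D pose, pose to person image, image to relit image, with the same text guiding every stage) can be sketched as a simple data flow. This is only an illustrative skeleton, not the authors' implementation: every function name, the 17-keypoint skeleton, the toy 8x8 grayscale "image", and the keyword-based brightness gain are assumptions standing in for the learned pose predictor, diffusion model, and relighting module.

```python
from dataclasses import dataclass
from typing import List, Tuple

Keypoint = Tuple[float, float]  # (x, y) in normalized image coordinates
Image = List[List[float]]       # toy grayscale image, values in [0, 1]

# Assumption: a 17-keypoint skeleton; the abstract does not specify the pose format.
NUM_KEYPOINTS = 17


@dataclass
class FashionResult:
    prompt: str
    pose: List[Keypoint]
    image: Image
    relit: Image


def predict_pose(prompt: str) -> List[Keypoint]:
    """Stage 1 placeholder: text -> 2D pose (a real system would use a learned model)."""
    # Dummy layout: keypoints spaced evenly down the vertical centre line.
    return [(0.5, i / (NUM_KEYPOINTS - 1)) for i in range(NUM_KEYPOINTS)]


def generate_person_image(prompt: str, pose: List[Keypoint], size: int = 8) -> Image:
    """Stage 2 placeholder: (text, pose) -> image; stands in for the diffusion model."""
    image = [[0.2 for _ in range(size)] for _ in range(size)]
    for x, y in pose:
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        image[row][col] = 0.6  # mark the pixel under each keypoint
    return image


def relight(prompt: str, image: Image) -> Image:
    """Stage 3 placeholder: (text, image) -> relit image, here a global brightness gain."""
    gain = 1.5 if "bright" in prompt.lower() else 1.0
    return [[min(1.0, px * gain) for px in row] for row in image]


def run_pipeline(prompt: str) -> FashionResult:
    """Chain the three stages; the same prompt conditions each one."""
    pose = predict_pose(prompt)
    image = generate_person_image(prompt, pose)
    relit = relight(prompt, image)
    return FashionResult(prompt, pose, image, relit)
```

Calling `run_pipeline("a model walking in bright sunlight")` returns the intermediate pose and image alongside the relit output, which mirrors how the abstract presents pose and lighting as separately controllable stages driven by one description.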

