CV · Jul 25, 2023

Fashion Matrix: Editing Photos by Just Talking

arXiv:2307.13240v1 · 3 citations · h-index: 27
Originality: Synthesis-oriented
AI Analysis

This work addresses automated fashion photo editing for commercial applications, but its contribution is incremental: it combines existing models without introducing new core methods.

The paper applies Large Language Models to fashion photo editing through Fashion Matrix, a hierarchical AI system. Through iterative interaction with the user, the system handles diverse prompt-driven tasks such as garment replacement and recoloring, automating the fashion editing process.

The utilization of Large Language Models (LLMs) for the construction of AI systems has garnered significant attention across diverse fields. The extension of LLMs to the domain of fashion holds substantial commercial potential but also inherent challenges due to the intricate semantic interactions in fashion-related generation. To address this issue, we developed a hierarchical AI system called Fashion Matrix dedicated to editing photos by just talking. This system facilitates diverse prompt-driven tasks, encompassing garment or accessory replacement, recoloring, addition, and removal. Specifically, Fashion Matrix employs an LLM as its foundational support and engages in iterative interactions with users. It employs a range of Semantic Segmentation Models (e.g., Grounded-SAM, MattingAnything) to delineate the specific editing masks based on user instructions. Subsequently, Visual Foundation Models (e.g., Stable Diffusion, ControlNet) are leveraged to generate edited images from text prompts and masks, thereby facilitating the automation of fashion editing processes. Experiments demonstrate the outstanding ability of Fashion Matrix to explore the collaborative potential of functionally diverse pre-trained models in the domain of fashion editing.
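
The three stages the abstract describes (an LLM interprets the instruction, a segmentation model produces the editing mask, a generative model fills the masked region) can be illustrated with a toy sketch. This is not the Fashion Matrix implementation. Every name below (`EditPlan`, `toy_planner`, `toy_segmenter`, `toy_generator`, `edit`) is hypothetical, and each stage is replaced by a trivial stand-in: keyword matching for the LLM, a bounding-box lookup for the segmenter, and a constant fill for the diffusion model. Only the data flow between the stages is meant to mirror the description.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Image = List[List[int]]            # toy grayscale image, one int per pixel
Mask = List[List[bool]]            # True where the edit is allowed
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), end-exclusive

TASKS = ("replace", "recolor", "add", "remove")  # task types named in the abstract
ARTICLES = {"the", "a", "an"}


@dataclass
class EditPlan:
    task: str      # one of TASKS
    target: str    # garment or accessory to edit
    prompt: str    # text handed on to the generator


def toy_planner(instruction: str) -> EditPlan:
    """Stand-in for the LLM: finds the task verb and the first noun after it."""
    words = instruction.lower().strip(".!?").split()
    for i, word in enumerate(words):
        if word in TASKS:
            rest = [w for w in words[i + 1:] if w not in ARTICLES]
            if rest:
                return EditPlan(task=word, target=rest[0], prompt=instruction)
    raise ValueError(f"no supported task found in: {instruction!r}")


def toy_segmenter(image: Image, target: str, regions: Dict[str, Box]) -> Mask:
    """Stand-in for a segmentation model: looks the target up in known boxes."""
    top, left, bottom, right = regions[target]
    return [[top <= r < bottom and left <= c < right for c in range(len(row))]
            for r, row in enumerate(image)]


def toy_generator(image: Image, mask: Mask, fill: int) -> Image:
    """Stand-in for a diffusion model: overwrites masked pixels, keeps the rest."""
    return [[fill if m else px for px, m in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]


def edit(image: Image, instruction: str, regions: Dict[str, Box], fill: int = 9):
    """Chain the stages: instruction -> plan -> mask -> edited image."""
    plan = toy_planner(instruction)
    mask = toy_segmenter(image, plan.target, regions)
    return plan, toy_generator(image, mask, fill)


image = [[0] * 4 for _ in range(3)]
plan, edited = edit(image, "Remove the hat", {"hat": (0, 1, 1, 3)})
print(plan.task, plan.target)
print(edited)
```

The point of the sketch is the separation of concerns: the planner never touches pixels, and the generator only changes pixels inside the mask, so each stage could in principle be swapped for a real pre-trained model without changing the others.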

Code Implementations: 1 repo
