CV · Jul 20, 2024

Automatic Generation of Fashion Images using Prompting in Generative Machine Learning Models

arXiv:2407.14944v1 · 7 citations · h-index: 29 · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the problem of automating fashion image creation for the fashion industry, but its contribution is incremental: it applies prompting methods to existing generative models.

This work generates tailored fashion images by combining prompting techniques for Large Language Models, such as zero-shot and few-shot learning, with Stable Diffusion. The outputs show greater diversity in colors and textures and are evaluated with CLIPscore and human judgment.

The advent of artificial intelligence has contributed to a groundbreaking transformation of the fashion industry, redefining creativity and innovation in unprecedented ways. This work investigates methodologies for generating tailored fashion descriptions using two distinct Large Language Models and a Stable Diffusion model for fashion image creation. Emphasizing adaptability in AI-driven fashion creativity, we depart from traditional approaches and focus on prompting techniques, such as zero-shot and few-shot learning, as well as Chain-of-Thought (CoT), which yield a variety of colors and textures and enhance the diversity of the outputs. Central to our methodology is Retrieval-Augmented Generation (RAG), which enriches the models with insights from fashion sources to ensure contemporary representations. Evaluation combines quantitative metrics such as CLIPscore with qualitative human judgment, highlighting strengths in creativity, coherence, and aesthetic appeal across diverse styles. Participants in the human evaluation preferred RAG and few-shot learning techniques for their ability to produce more relevant and appealing fashion descriptions. Our code is provided at https://github.com/georgiarg/AutoFashion.
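
The prompting strategies named in the abstract differ mainly in what text surrounds the user's request before it is sent to the language model. The sketch below illustrates that difference with plain string templates. It is not taken from the AutoFashion repository: the task wording, the `Example` class, and all function names are hypothetical, and the call to an actual LLM or to Stable Diffusion is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One hand-written demonstration for few-shot prompting (hypothetical)."""
    request: str
    description: str


# Hypothetical task instruction; the paper's actual wording is not given here.
TASK = "Write a one-sentence outfit description suitable as a text-to-image prompt."


def zero_shot(request: str) -> str:
    # Zero-shot: the task instruction and the request only, no demonstrations.
    return f"{TASK}\nRequest: {request}\nDescription:"


def few_shot(request: str, examples: list[Example]) -> str:
    # Few-shot: place solved demonstrations before the new request.
    demos = "\n".join(
        f"Request: {ex.request}\nDescription: {ex.description}" for ex in examples
    )
    return f"{TASK}\n{demos}\nRequest: {request}\nDescription:"


def chain_of_thought(request: str) -> str:
    # CoT: ask the model to reason through intermediate steps before answering.
    return (
        f"{TASK}\nRequest: {request}\n"
        "Think step by step about occasion, garments, colors, and textures, "
        "then give the final description.\nReasoning:"
    )


def rag(request: str, passages: list[str]) -> str:
    # RAG: insert retrieved passages as context ahead of the request.
    # Retrieval itself is assumed to have happened elsewhere.
    context = "\n".join(f"- {p}" for p in passages)
    return (
        f"{TASK}\nContext from fashion sources:\n{context}\n"
        f"Request: {request}\nDescription:"
    )


if __name__ == "__main__":
    demo = [Example("office look", "A tailored navy blazer over a cream silk blouse.")]
    print(few_shot("summer wedding guest", demo))
```

In a pipeline of the kind the abstract describes, the model's completion of one of these prompts would serve as the text prompt for the image generator, so the four functions can be swapped without touching the image-generation step.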

Code Implementations: 1 repo
