Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation
This work addresses the tedious manual process of crafting negative prompts for users of text-to-image models, though it is an incremental improvement over existing techniques.
The paper tackles the problem of manually creating negative prompts for text-to-image generation by proposing NegOpt, a method that optimizes negative prompt generation using supervised fine-tuning and reinforcement learning. The combined approach yields a 25% increase in Inception Score compared to other approaches.
In text-to-image generation, using negative prompts, which describe undesirable image characteristics, can significantly boost image quality. However, producing good negative prompts is manual and tedious. To address this, we propose NegOpt, a novel method for optimizing negative prompt generation toward enhanced image generation, using supervised fine-tuning and reinforcement learning. Our combined approach results in a substantial increase of 25% in Inception Score compared to other approaches and surpasses ground-truth negative prompts from the test set. Furthermore, with NegOpt we can preferentially optimize the metrics most important to us. Finally, we construct Negative Prompts DB (https://huggingface.co/datasets/mikeogezi/negopt_full), a publicly available dataset of negative prompts.
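To make the headline metric and the idea of preferential optimization concrete, here is a minimal sketch in plain Python. The `inception_score` function follows the standard definition, IS = exp(E_x[KL(p(y|x) || p(y))]), and takes per-image class probabilities as input; in practice these would come from an Inception classifier, which is not included here. The `weighted_reward` function is a hypothetical illustration of how a scalar reward could favor some metrics over others. The abstract does not specify NegOpt's reward formulation, so the metric names and the weighted-sum form are assumptions.

```python
import math


def inception_score(class_probs):
    """Inception Score from per-image class probabilities p(y|x).

    IS = exp(mean over images of KL(p(y|x) || p(y))),
    where p(y) is the average of p(y|x) over all images.
    """
    n = len(class_probs)
    num_classes = len(class_probs[0])
    # Marginal class distribution p(y) across the image set.
    marginal = [sum(p[k] for p in class_probs) / n for k in range(num_classes)]
    total_kl = 0.0
    for p in class_probs:
        # Terms with p(y|x) = 0 contribute nothing to the KL divergence.
        total_kl += sum(
            pk * math.log(pk / marginal[k]) for k, pk in enumerate(p) if pk > 0
        )
    return math.exp(total_kl / n)


def weighted_reward(metrics, weights):
    """Hypothetical scalar reward: a weighted sum of metric values.

    This is an assumed form for illustration, not the paper's reward.
    """
    return sum(weights[name] * metrics[name] for name in weights)


# Confident, diverse predictions give a high score; uniform ones give 1.
confident = inception_score([[1.0, 0.0], [0.0, 1.0]])
uniform = inception_score([[0.5, 0.5], [0.5, 0.5]])

# Assumed metric names; a larger weight emphasizes that metric.
reward = weighted_reward(
    {"aesthetics": 0.5, "fidelity": 0.25},
    {"aesthetics": 2.0, "fidelity": 4.0},
)
```

With two classes, the score ranges from 1 (every image gets the same uniform prediction) to 2 (each image is classified confidently and the classes are evenly covered). Raising one weight relative to the others is the simplest way to express which metric matters most under this assumed reward.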