CAP: Evaluation of Persuasive and Creative Image Generation
This work addresses the challenge of evaluating implicit, creative, and persuasive qualities in advertisement images for AI and marketing applications, but it is incremental because it builds on existing text-to-image evaluation frameworks.
The paper tackles the problem of evaluating advertisement image generation by introducing three metrics for Creativity, prompt Alignment, and Persuasiveness (CAP). It finds that current text-to-image models struggle with these aspects, especially for implicit prompts, and proposes a method to improve them.
We address the task of advertisement image generation and introduce three evaluation metrics to assess Creativity, prompt Alignment, and Persuasiveness (CAP) in generated advertisement images. Despite recent advances in Text-to-Image (T2I) generation and the ability of T2I models to generate high-quality images from explicit descriptions, evaluating these models remains challenging. Existing evaluation methods focus largely on assessing alignment with explicit, detailed descriptions, whereas evaluating alignment with visually implicit prompts remains an open problem. Additionally, creativity and persuasiveness are essential qualities that enhance the effectiveness of advertisement images, yet they are seldom measured. To address this, we propose three novel metrics for evaluating the creativity, alignment, and persuasiveness of generated images. Our findings reveal that current T2I models struggle with creativity, persuasiveness, and alignment when the input text is an implicit message. We further introduce a simple yet effective approach to enhance T2I models' capabilities in producing images that are better aligned, more creative, and more persuasive.