T-Stars-Poster: A Framework for Product-Centric Advertising Image Design
The work addresses the labor-intensive process of creating advertising images, a burden on marketers and designers, though its contribution is incremental: it builds on existing components such as VLMs and SDXL.
The paper automates advertising image design with T-Stars-Poster, a product-centric framework built from four sequential stages: prompt generation, layout generation, background image generation, and graphics rendering. Experiments and online A/B tests indicate that the method produces more visually appealing images.
Creating advertising images is often a labor-intensive and time-consuming process. Can we automatically generate such images using basic product information like a product foreground image, taglines, and a target size? Existing methods mainly focus on parts of the problem and lack a comprehensive solution. To bridge this gap, we propose a novel product-centric framework for advertising image design called T-Stars-Poster. It consists of four sequential stages to highlight product foregrounds and taglines while achieving overall image aesthetics: prompt generation, layout generation, background image generation, and graphics rendering. Different expert models are designed and trained for the first three stages: First, a visual language model (VLM) generates background prompts that match the products. Next, a VLM-based layout generation model arranges the placement of product foregrounds, graphic elements (taglines and decorative underlays), and various nongraphic elements (objects from the background prompt). Following this, an SDXL-based model can simultaneously accept prompts, layouts, and foreground controls to generate images. To support T-Stars-Poster, we create two corresponding datasets with over 50,000 labeled images. Extensive experiments and online A/B tests demonstrate that T-Stars-Poster can produce more visually appealing advertising images.
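The four sequential stages described above can be sketched as a simple pipeline. This is a minimal illustration of the control flow only, not the authors' implementation: every class and function name below (`ProductInput`, `PosterDraft`, `generate_prompt`, and so on) is hypothetical, and each stage body is a deterministic placeholder standing in for the trained VLM, the VLM-based layout model, the SDXL-based generator, and the renderer.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass
class ProductInput:
    """Basic product information named in the abstract."""
    foreground_path: str          # product foreground image
    taglines: List[str]
    target_size: Tuple[int, int]  # (width, height)


@dataclass
class LayoutElement:
    kind: str   # "foreground", "tagline", "underlay", or "object"
    label: str
    box: Box


@dataclass
class PosterDraft:
    """State passed from one stage to the next (hypothetical structure)."""
    product: ProductInput
    prompt: str = ""
    layout: List[LayoutElement] = field(default_factory=list)
    background: str = ""
    rendered: bool = False


def generate_prompt(draft: PosterDraft) -> PosterDraft:
    # Placeholder for stage 1: a VLM would write a background prompt
    # that matches the product. Here we return a fixed string.
    draft.prompt = "a clean studio scene with soft lighting"
    return draft


def generate_layout(draft: PosterDraft) -> PosterDraft:
    # Placeholder for stage 2: a layout model would place all elements.
    # Here we use fixed fractions of the canvas (an assumption).
    w, h = draft.product.target_size
    draft.layout = [
        LayoutElement("foreground", "product", (w // 4, h // 2, w // 2, h // 3))
    ]
    for i, text in enumerate(draft.product.taglines):
        box = (w // 10, h // 10 + i * (h // 10), w * 8 // 10, h // 12)
        draft.layout.append(LayoutElement("tagline", text, box))
    return draft


def generate_background(draft: PosterDraft) -> PosterDraft:
    # Placeholder for stage 3: an SDXL-based model would take the prompt,
    # layout, and foreground as controls. Here we only record a descriptor.
    draft.background = f"background<{draft.prompt}|{len(draft.layout)} boxes>"
    return draft


def render_graphics(draft: PosterDraft) -> PosterDraft:
    # Placeholder for stage 4: taglines and underlays would be drawn
    # onto the generated image.
    draft.rendered = True
    return draft


STAGES: List[Callable[[PosterDraft], PosterDraft]] = [
    generate_prompt,
    generate_layout,
    generate_background,
    render_graphics,
]


def run_pipeline(product: ProductInput) -> PosterDraft:
    """Run the four stages in order, each consuming the previous output."""
    draft = PosterDraft(product=product)
    for stage in STAGES:
        draft = stage(draft)
    return draft
```

Keeping each stage as a function over a shared draft object mirrors the sequential design: a later stage can read everything produced earlier, so the background stage sees both the prompt and the layout, and a single stage can be swapped out without touching the others.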