CV Sep 15, 2025

Do It Yourself (DIY): Modifying Images for Poems in a Zero-Shot Setting Using Weighted Prompt Manipulation

arXiv:2509.11878v11 citations h-index: 11 EMNLP
Originality: Incremental advance
AI Analysis

This work addresses the need for customizable image generation in poetry interpretation. The advance is incremental, since it builds on existing diffusion and language models.

The paper tackles the problem of generating images for poems in a zero-shot setting by introducing a Weighted Prompt Manipulation technique that modifies attention weights and text embeddings in diffusion models, resulting in semantically richer and more contextually accurate visualizations.

Poetry is an expressive form of art that invites multiple interpretations, as readers often bring their own emotions, experiences, and cultural backgrounds into their understanding of a poem. Recognizing this, we aim to generate images for poems and improve these images in a zero-shot setting, enabling audiences to modify images as per their requirements. To achieve this, we introduce a novel Weighted Prompt Manipulation (WPM) technique, which systematically modifies attention weights and text embeddings within diffusion models. By dynamically adjusting the importance of specific words, WPM enhances or suppresses their influence in the final generated image, leading to semantically richer and more contextually accurate visualizations. Our approach exploits diffusion models and large language models (LLMs) such as GPT in conjunction with existing poetry datasets, ensuring a comprehensive and structured methodology for improved image generation in the literary domain. To the best of our knowledge, this is the first attempt at integrating weighted prompt manipulation for enhancing imagery in poetic language.
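To make the idea of adjusting the importance of specific words concrete, here is a minimal sketch of one simple form of prompt weighting: scaling each token's text embedding by a per-word weight before it would be passed to a diffusion model. This is not the paper's implementation. The function name `apply_word_weights`, the toy two-dimensional embeddings, and the example weights are all hypothetical, and the abstract's attention-weight modification is not modelled here.

```python
from typing import Dict, List


def apply_word_weights(
    tokens: List[str],
    embeddings: List[List[float]],
    weights: Dict[str, float],
) -> List[List[float]]:
    """Scale each token's embedding by its weight (hypothetical helper).

    A weight above 1.0 emphasises a word, a weight below 1.0 suppresses it,
    and tokens without an entry in `weights` keep a weight of 1.0.
    """
    if len(tokens) != len(embeddings):
        raise ValueError("tokens and embeddings must have the same length")

    weighted = []
    for token, vector in zip(tokens, embeddings):
        w = weights.get(token.lower(), 1.0)
        weighted.append([w * x for x in vector])
    return weighted


# Toy example: every token gets the same placeholder embedding (an assumption
# made only to keep the arithmetic easy to follow).
tokens = ["a", "lonely", "moon", "over", "the", "sea"]
embeddings = [[1.0, 2.0] for _ in tokens]

# Emphasise "moon" and suppress "sea".
weighted = apply_word_weights(tokens, embeddings, {"moon": 1.5, "sea": 0.5})
```

In a real pipeline the embeddings would come from the diffusion model's text encoder, and a reader could change the weights and regenerate the image without any retraining, which is what makes the setting zero-shot.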
