CV · AI · Feb 24, 2025

Culture-TRIP: Culturally-Aware Text-to-Image Generation with Iterative Prompt Refinement

arXiv:2502.16902v2 · 13 citations · h-index: 5 · NAACL
Originality: Incremental advance
AI Analysis

The work addresses a domain-specific need, culturally accurate image generation, but it is incremental because it builds on existing models such as Stable Diffusion.

The paper tackles the failure of text-to-image models to generate appropriate images for underrepresented cultural concepts, such as 'hangari' (Korean utensil). It proposes Culture-TRIP, which refines prompts using retrieved cultural contexts and iterative evaluation. A user survey with 66 participants indicates that the method improves image-prompt alignment.

Text-to-image models, including Stable Diffusion, have significantly improved in generating images that are highly semantically aligned with the given prompts. However, existing models may fail to produce appropriate images for cultural concepts or objects that are not well known or are underrepresented in Western cultures, such as 'hangari' (Korean utensil). In this paper, we propose a novel approach, Culturally-Aware Text-to-Image Generation with Iterative Prompt Refinement (Culture-TRIP), which refines the prompt to improve the alignment of the image with such culture nouns in text-to-image models. Our approach (1) retrieves cultural contexts and visual details related to the culture nouns in the prompt and (2) iteratively refines and evaluates the prompt based on a set of cultural criteria and large language models. The refinement process uses information retrieved from Wikipedia and the Web. Our user survey, conducted with 66 participants from eight different countries, demonstrates that our proposed approach enhances the alignment between the images and the prompts. In particular, Culture-TRIP demonstrates improved alignment between the generated images and underrepresented culture nouns. Resources can be found at https://shane3606.github.io/Culture-TRIP.
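
The two-step loop described in the abstract (retrieve context, then iteratively refine and evaluate the prompt) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the `threshold` and `max_iters` parameters, and the toy retrieval, refinement, and scoring stand-ins are all assumptions made for the example. In the paper, retrieval draws on Wikipedia and the Web, and refinement and evaluation rely on large language models and a set of cultural criteria that the abstract does not enumerate.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RefinementResult:
    prompt: str       # best prompt found
    score: float      # score of that prompt under the evaluator
    iterations: int   # number of refinement attempts made


def refine_prompt(
    prompt: str,
    culture_noun: str,
    retrieve: Callable[[str], List[str]],
    refine: Callable[[str, List[str]], str],
    evaluate: Callable[[str, List[str]], float],
    threshold: float = 1.0,   # hypothetical stopping score
    max_iters: int = 5,       # hypothetical iteration budget
) -> RefinementResult:
    """Retrieve context once, then refine until the score is good enough."""
    context = retrieve(culture_noun)
    best_prompt, best_score = prompt, evaluate(prompt, context)
    iterations = 0
    while best_score < threshold and iterations < max_iters:
        candidate = refine(best_prompt, context)
        candidate_score = evaluate(candidate, context)
        iterations += 1
        # Keep a candidate only if it scores higher than the current best.
        if candidate_score > best_score:
            best_prompt, best_score = candidate, candidate_score
    return RefinementResult(best_prompt, best_score, iterations)


# Toy stand-ins (assumptions): a real system would query external sources
# and call an LLM instead of using these string heuristics.
def toy_retrieve(noun: str) -> List[str]:
    # Illustrative placeholder details, not text retrieved by the paper.
    return ["a Korean earthenware jar", "used to ferment food"]


def toy_refine(prompt: str, context: List[str]) -> str:
    # Append the first retrieved detail the prompt does not mention yet.
    for detail in context:
        if detail not in prompt:
            return f"{prompt}, {detail}"
    return prompt


def toy_evaluate(prompt: str, context: List[str]) -> float:
    # Fraction of retrieved details that appear in the prompt.
    if not context:
        return 1.0
    return sum(detail in prompt for detail in context) / len(context)


result = refine_prompt(
    "A photo of a hangari", "hangari", toy_retrieve, toy_refine, toy_evaluate
)
print(result.prompt)
```

Passing the retriever, refiner, and evaluator as callables keeps the loop independent of any particular search backend or language model, so the toy functions can be replaced without touching the control flow.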

