MM, CL, HC | Apr 20, 2022

A Taxonomy of Prompt Modifiers for Text-To-Image Generation

arXiv:2204.13988v3 | 169 citations | h-index: 13
Originality: Synthesis-oriented
AI Analysis

The paper addresses the need for a conceptual framework for prompt engineering in text-to-image generation, offering incremental insights for the HCI and HAI communities.

The paper identifies six types of prompt modifiers used in text-to-image generation through an ethnographic study, providing a taxonomy to help researchers and practitioners improve image generation.

Text-to-image generation has seen an explosion of interest since 2021. Today, beautiful and intriguing digital images and artworks can be synthesized from textual inputs ("prompts") with deep generative models. Online communities around text-to-image generation and AI generated art have quickly emerged. This paper identifies six types of prompt modifiers used by practitioners in the online community based on a 3-month ethnographic study. The novel taxonomy of prompt modifiers provides researchers a conceptual starting point for investigating the practice of text-to-image generation, but may also help practitioners of AI generated art improve their images. We further outline how prompt modifiers are applied in the practice of "prompt engineering." We discuss research opportunities of this novel creative practice in the field of Human-Computer Interaction (HCI). The paper concludes with a discussion of broader implications of prompt engineering from the perspective of Human-AI Interaction (HAI) in future applications beyond the use case of text-to-image generation and AI generated art.
