CV | Dec 22, 2025

Emotion-Director: Bridging Affective Shortcut in Emotion-Oriented Image Generation

arXiv:2512.19479v1 | h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific issue in emotion-oriented image generation for applications like advertising, offering an incremental improvement by combining existing methods with novel modules.

The paper tackles the affective shortcut in emotion-oriented image generation, where emotions are incorrectly treated as equivalent to semantics. It proposes Emotion-Director, a cross-modal collaboration framework that combines visual and textual prompts to generate images whose emotional content goes beyond semantics. The authors report that it outperforms prior methods in both qualitative and quantitative experiments.

Image generation based on diffusion models has demonstrated impressive capability, motivating exploration into diverse and specialized applications. Owing to the importance of emotion in advertising, emotion-oriented image generation has attracted increasing attention. However, current emotion-oriented methods suffer from an affective shortcut, where emotions are approximated to semantics. As evidenced by two decades of research, emotion is not equivalent to semantics. To this end, we propose Emotion-Director, a cross-modal collaboration framework consisting of two modules. First, we propose a cross-Modal Collaborative diffusion model, abbreviated as MC-Diffusion. MC-Diffusion integrates visual prompts with textual prompts for guidance, enabling the generation of emotion-oriented images beyond semantics. Further, we improve the DPO optimization by a negative visual prompt, enhancing the model's sensitivity to different emotions under the same semantics. Second, we propose MC-Agent, a cross-Modal Collaborative Agent system that rewrites textual prompts to express the intended emotions. To avoid template-like rewrites, MC-Agent employs multi-agents to simulate human subjectivity toward emotions, and adopts a chain-of-concept workflow that improves the visual expressiveness of the rewritten prompts. Extensive qualitative and quantitative experiments demonstrate the superiority of Emotion-Director in emotion-oriented image generation.
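The abstract does not state the exact objective used for the DPO step, so the following is only a minimal sketch of the generic DPO loss for a single preference pair, not the authors' formulation. It assumes, for illustration, that the preferred sample is an image matching the target emotion and the rejected sample is one produced under a negative visual prompt with the same semantics. The function name, argument names, and the `beta` default are hypothetical.

```python
import math


def dpo_loss(policy_logp_win, policy_logp_lose,
             ref_logp_win, ref_logp_lose, beta=0.1):
    """Generic DPO loss for one (preferred, rejected) pair.

    Assumption (not from the paper): "win" is a sample matching the target
    emotion, "lose" is a sample generated under a negative visual prompt.
    Inputs are log-likelihoods under the trained policy and a frozen reference.
    """
    # How much more the policy favours the preferred sample than the
    # reference model does, relative to the rejected sample.
    margin = (policy_logp_win - ref_logp_win) - (policy_logp_lose - ref_logp_lose)
    z = beta * margin
    # -log(sigmoid(z)) computed as softplus(-z) in a numerically stable way.
    if z > 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# With no preference shift relative to the reference, the loss is log(2).
baseline = dpo_loss(0.0, 0.0, 0.0, 0.0)
# When the policy favours the preferred sample more than the reference does,
# the loss drops below that baseline.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

In a diffusion setting the log-likelihood terms are typically replaced by per-sample denoising errors; that substitution is omitted here to keep the sketch short.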
