CV · Sep 28, 2025

Griffin: Generative Reference and Layout Guided Image Composition

arXiv:2509.23643v2 · 1 citation · h-index: 14
Originality: Incremental advance
AI Analysis

The paper addresses the need for finer control in image generation for users who require explicit placement guidance. The advance is incremental, since it builds on existing text-to-image models.

The paper tackles precise layout control in image generation by specifying content through reference images instead of text. The resulting method is training-free, needs only a single image per reference, and enables explicit object-level and part-level composition.

Text-to-image models have achieved a level of realism that enables the generation of highly convincing images. However, text-based control can be a limiting factor when more explicit guidance is needed. Defining both the content and its precise placement within an image is crucial for achieving finer control. In this work, we address the challenge of multi-image layout control, where the desired content is specified through images rather than text, and the model is guided on where to place each element. Our approach is training-free, requires a single image per reference, and provides explicit and simple control for object and part-level composition. We demonstrate its effectiveness across various image composition tasks.
