CV · Mar 20, 2025

GraPLUS: Graph-based Placement Using Semantics for Image Composition

arXiv:2503.15761v1 · h-index: 18 · Computer Vision and Image Understanding
Originality: Highly original
AI Analysis

The paper addresses the challenge of generating realistic image compositions for computer vision and graphics applications, and it reports a strong, domain-specific gain over prior work.

The paper tackles the problem of plausible object placement in images by proposing GraPLUS, a framework that uses scene graphs and large language models to determine contextually appropriate positions, achieving 92.1% placement accuracy and outperforming state-of-the-art methods by 8.1%.

We present GraPLUS (Graph-based Placement Using Semantics), a novel framework for plausible object placement in images that leverages scene graphs and large language models. Our approach uniquely combines graph-structured scene representation with semantic understanding to determine contextually appropriate object positions. The framework employs GPT-2 to transform categorical node and edge labels into rich semantic embeddings that capture both definitional characteristics and typical spatial contexts, enabling nuanced understanding of object relationships and placement patterns. GraPLUS achieves placement accuracy of 92.1% and an FID score of 28.83 on the OPA dataset, outperforming state-of-the-art methods by 8.1% while maintaining competitive visual quality. In human evaluation studies involving 964 samples assessed by 19 participants, our method was preferred in 52.1% of cases, significantly outperforming previous approaches. The framework's key innovations include: (i) leveraging pre-trained scene graph models that transfer knowledge from other domains, (ii) edge-aware graph neural networks that process scene semantics through structured relationships, (iii) a cross-modal attention mechanism that aligns categorical embeddings with enhanced scene features, and (iv) a multiobjective training strategy incorporating semantic consistency constraints.
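To make innovations (ii) and (iii) concrete, here is a minimal sketch of one edge-aware message-passing step followed by a single-query attention step over the resulting scene features. It is a toy illustration, not the authors' architecture: the function names (`edge_aware_update`, `cross_attention`), the 2-d feature vectors, the additive update rule, and the example scene are all assumptions made for this sketch, and hand-written vectors stand in for the GPT-2 label embeddings.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def edge_aware_update(nodes, edges):
    """One message-passing round (assumed rule, for illustration only).

    Each node adds the mean of (source feature + edge feature) taken
    over its incoming edges; nodes with no incoming edge are unchanged.
    """
    updated = {}
    for name, feat in nodes.items():
        msgs = [
            [n + e for n, e in zip(nodes[src], efeat)]
            for src, dst, efeat in edges
            if dst == name
        ]
        if msgs:
            mean = [sum(col) / len(msgs) for col in zip(*msgs)]
            updated[name] = [f + m for f, m in zip(feat, mean)]
        else:
            updated[name] = list(feat)
    return updated

def cross_attention(query, nodes):
    """Scaled dot-product attention with one query over all scene nodes."""
    names = list(nodes)
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, nodes[n]) / scale for n in names])
    context = [
        sum(w * nodes[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return dict(zip(names, weights)), context

# Hypothetical scene graph: toy 2-d vectors stand in for label embeddings.
nodes = {"table": [1.0, 0.0], "floor": [0.0, 1.0], "lamp": [0.5, 0.5]}
# Edges are (source, destination, edge feature), e.g. table "on" floor.
edges = [("table", "floor", [0.0, 0.2]), ("lamp", "table", [0.3, 0.0])]

enhanced = edge_aware_update(nodes, edges)
# Hypothetical embedding of the object category to be placed.
weights, context = cross_attention([1.0, 0.0], enhanced)
print(max(weights, key=weights.get))
```

In this toy scene the query attends most strongly to the table node, because its updated feature is the one most aligned with the query vector. A real system would replace the hand-written vectors with learned embeddings and the additive update with trained layers.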

