CV | AI | Sep 6, 2024

Thinking Outside the BBox: Unconstrained Generative Object Compositing

arXiv:2409.04559v2 | 25 citations | h-index: 41
AI Analysis

This work addresses the challenge of realistic object compositing in image editing by offering users a more flexible and automated workflow. It is incremental in that it builds on existing diffusion-based methods.

The paper tackles generative object compositing by introducing an unconstrained approach in which generation is not bounded by the input mask. This lets the model produce realistic object effects, such as shadows and reflections, that extend beyond the mask boundary. When an empty mask is provided, the model automatically places the object at natural locations and scales. It outperforms existing object placement and compositing models in quality metrics and user studies.

Compositing an object into an image involves multiple non-trivial sub-tasks such as object placement and scaling, color/lighting harmonization, viewpoint/geometry adjustment, and shadow/reflection generation. Recent generative image compositing methods leverage diffusion models to handle multiple sub-tasks at once. However, existing models face limitations due to their reliance on masking the original object during training, which constrains their generation to the input mask. Furthermore, obtaining an accurate input mask specifying the location and scale of the object in a new image can be highly challenging. To overcome such limitations, we define a novel problem of unconstrained generative object compositing, i.e., the generation is not bounded by the mask, and train a diffusion-based model on a synthesized paired dataset. Our first-of-its-kind model is able to generate object effects such as shadows and reflections that go beyond the mask, enhancing image realism. Additionally, if an empty mask is provided, our model automatically places the object in diverse natural locations and scales, accelerating the compositing workflow. Our model outperforms existing object placement and compositing models in various quality metrics and user studies.
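To picture what "constrained to the input mask" means, here is a minimal toy sketch. It is an illustration only, not the paper's model: it assumes a mask-bounded compositor behaves like a hard per-pixel blend, whereas the abstract attributes the constraint to how existing models are trained. The function names and the tiny grayscale "images" are hypothetical.

```python
def mask_bounded_composite(background, generated, mask):
    """Keep `generated` pixels where mask is 1 and `background` pixels elsewhere."""
    return [
        [g if m else b for b, g, m in zip(bg_row, gen_row, mask_row)]
        for bg_row, gen_row, mask_row in zip(background, generated, mask)
    ]


def changed_outside_mask(background, output, mask):
    """Count pixels outside the mask that differ from the background."""
    return sum(
        1
        for bg_row, out_row, mask_row in zip(background, output, mask)
        for b, o, m in zip(bg_row, out_row, mask_row)
        if not m and b != o
    )


# Toy 3x4 grayscale scene: uniform background with value 200.
background = [[200] * 4 for _ in range(3)]

# Hypothetical generator output: an object (50) in the top two rows
# and a shadow (120) on the bottom row, which lies outside the mask.
generated = [
    [200, 50, 50, 200],
    [200, 50, 50, 200],
    [200, 120, 120, 120],
]

# The mask covers only the object, not its shadow.
mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

bounded = mask_bounded_composite(background, generated, mask)

print(changed_outside_mask(background, bounded, mask))    # shadow is discarded
print(changed_outside_mask(background, generated, mask))  # shadow survives
```

In this toy setup the mask-bounded result leaves every pixel outside the mask untouched, so the shadow row is lost, while the unconstrained output keeps it.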

