Shadow Generation for Composite Image Using Diffusion Model
This work addresses a specific problem in image composition for computer vision applications, but its contribution is incremental because it builds on existing foundation models and datasets.
The paper tackles the challenge of generating realistic shadows for inserted foregrounds in composite images by adapting ControlNet and introducing intensity modulation modules. Its experiments report superior performance on both the DESOBA and DESOBAv2 datasets.
In the realm of image composition, generating a realistic shadow for the inserted foreground remains a formidable challenge. Previous works have developed image-to-image translation models that are trained on paired data. However, they struggle to generate shadows with accurate shapes and intensities, hindered by data scarcity and the inherent complexity of the task. In this paper, we resort to a foundation model with rich prior knowledge of natural shadow images. Specifically, we first adapt ControlNet to our task and then propose intensity modulation modules to improve the shadow intensity. Moreover, we extend the small-scale DESOBA dataset to DESOBAv2 using a novel data acquisition pipeline. Experimental results on both the DESOBA and DESOBAv2 datasets, as well as on real composite images, demonstrate the superior capability of our model for the shadow generation task. The dataset, code, and model are released at https://github.com/bcmi/Object-Shadow-Generation-Dataset-DESOBAv2.
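To make the notion of "shadow intensity" concrete, the toy sketch below darkens the pixels under a shadow mask by a scalar factor. It is only an illustration under simplifying assumptions (a grayscale image stored as nested lists, a known shadow mask, a single global intensity), and it is not the intensity modulation module proposed in the paper. The names `apply_shadow`, `shadow_mask`, and `intensity` are hypothetical.

```python
def apply_shadow(image, shadow_mask, intensity):
    """Darken pixels covered by shadow_mask.

    Assumption for illustration only: intensity is a scalar in [0, 1],
    where 0 leaves the image unchanged and 1 turns shadowed pixels black.
    """
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    return [
        [pixel * (1.0 - intensity * m) for pixel, m in zip(row, mask_row)]
        for row, mask_row in zip(image, shadow_mask)
    ]


# A flat gray background and a hypothetical shadow mask (1.0 = in shadow).
background = [[0.8, 0.8, 0.8], [0.8, 0.8, 0.8]]
shadow_mask = [[0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]

weak = apply_shadow(background, shadow_mask, 0.25)    # faint shadow
strong = apply_shadow(background, shadow_mask, 0.75)  # dark shadow
```

With the shape of the shadow held fixed, only the intensity differs between `weak` and `strong`, which is the kind of error (correct shape, wrong darkness) that an intensity-focused component would aim to reduce.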