Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
This work addresses 3D generation, a problem of interest to computer vision and graphics researchers. It offers a method that reuses existing pretrained 2D models; the contribution builds incrementally on established diffusion techniques.
The paper generates 3D data by repurposing pretrained 2D diffusion models: it back-propagates 2D scores through a differentiable renderer and aggregates them into a 3D score. Results are demonstrated on off-the-shelf models, including Stable Diffusion.
A diffusion model learns to predict a vector field of gradients. We propose to apply the chain rule to the learned gradients, and back-propagate the score of a diffusion model through the Jacobian of a differentiable renderer, which we instantiate as a voxel radiance field. This setup aggregates 2D scores at multiple camera viewpoints into a 3D score, and repurposes a pretrained 2D model for 3D data generation. We identify a technical challenge of distribution mismatch that arises in this application, and propose a novel estimation mechanism to resolve it. We run our algorithm on several off-the-shelf diffusion image generative models, including the recently released Stable Diffusion trained on the large-scale LAION dataset.
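
The chaining step described above can be illustrated with a minimal sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: the renderer is a toy linear map per camera view (so its Jacobian is simply the matrix `P`), the pretrained 2D score model is replaced by the analytic score of a unit Gaussian, and the names `toy_score`, `render`, and `chained_3d_score` are hypothetical. A voxel radiance field and a real diffusion model would take their place in practice, with the vector-Jacobian product computed by automatic differentiation.

```python
import numpy as np


def toy_score(x, mean):
    """Stand-in for a pretrained 2D score model (assumption).

    Returns grad_x log N(x; mean, I), which is simply (mean - x).
    """
    return mean - x


def render(theta, P):
    """Toy differentiable renderer (assumption): a linear map x = P @ theta.

    theta holds the 3D parameters, P encodes one camera viewpoint.
    """
    return P @ theta


def chained_3d_score(theta, views, mean):
    """Chain rule: sum over views of J_view^T @ score_2D(render(theta, view))."""
    grad = np.zeros_like(theta)
    for P in views:
        x = render(theta, P)             # 2D "image" seen from this view
        score_2d = toy_score(x, mean)    # 2D score evaluated at that image
        grad += P.T @ score_2d           # Jacobian of the linear renderer is P
    return grad


rng = np.random.default_rng(0)
theta = rng.normal(size=8)                            # toy 3D parameters
views = [rng.normal(size=(4, 8)) for _ in range(3)]   # three toy camera views
mean = np.zeros(4)                                    # mode of the toy 2D prior

g = chained_3d_score(theta, views, mean)              # aggregated 3D score
theta_next = theta + 0.01 * g                         # one gradient-ascent step
```

In this toy setting the aggregated score is the exact gradient of the summed per-view log-densities, so it can be checked against finite differences. The sketch does not address the distribution mismatch mentioned in the abstract, which arises when rendered images fall outside the noise levels the 2D model was trained on.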