RealFusion: 360° Reconstruction of Any Object from a Single Image
This work addresses the challenge of monocular 3D reconstruction of objects, which enables applications in graphics and vision. The contribution is incremental, however, as it builds on prior diffusion-based methods.
The paper tackles the problem of reconstructing a full 360° photographic model of an object from a single image, achieving state-of-the-art results on benchmark images with faithful input matching and plausible extrapolation to unseen sides.
We consider the problem of reconstructing a full 360° photographic model of an object from a single image of it. We do so by fitting a neural radiance field to the image, but find this problem to be severely ill-posed. We thus take an off-the-shelf conditional image generator based on diffusion and engineer a prompt that encourages it to "dream up" novel views of the object. Using an approach inspired by DreamFields and DreamFusion, we fuse the given input view, the conditional prior, and other regularizers in a final, consistent reconstruction. We demonstrate state-of-the-art reconstruction results on benchmark images when compared to prior methods for monocular 3D reconstruction of objects. Qualitatively, our reconstructions provide a faithful match of the input view and a plausible extrapolation of its appearance and 3D shape, including to the side of the object not visible in the image.
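The fusion of the input view, the diffusion prior, and the regularizers can be pictured as a weighted objective. The following is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the masked mean-squared-error form of the reconstruction term, and the weights `lambda_prior` and `lambda_reg` are all hypothetical. The prior term is written in the style of DreamFusion's score distillation, where the signal passed back to the renderer is the weighted difference between predicted and injected noise; whether RealFusion uses exactly this form is an assumption here.

```python
import numpy as np


def reconstruction_loss(rendered, target, mask):
    """Mean squared error between the rendered input view and the given image.

    Only pixels inside the (hypothetical) object mask contribute.
    rendered, target: arrays of shape (H, W, C); mask: array of shape (H, W).
    """
    mask = mask[..., None].astype(rendered.dtype)
    # Guard against an empty mask to avoid division by zero.
    n = max(float(mask.sum()) * rendered.shape[-1], 1.0)
    return float((((rendered - target) ** 2) * mask).sum() / n)


def sds_residual(noise_pred, noise, weight=1.0):
    """Score-distillation-style residual: weight * (predicted noise - injected noise).

    In a DreamFusion-style setup this residual is used as the gradient with
    respect to the rendered novel view; it is not itself a scalar loss.
    """
    return weight * (noise_pred - noise)


def total_loss(rec, prior, reg, lambda_prior=1.0, lambda_reg=0.1):
    """Weighted sum of the three terms named in the abstract.

    `prior` stands in for a scalar surrogate of the diffusion prior term;
    the weights are illustrative values, not taken from the paper.
    """
    return rec + lambda_prior * prior + lambda_reg * reg


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((8, 8, 3))
    rendered = target + 0.05 * rng.standard_normal((8, 8, 3))
    mask = np.ones((8, 8))
    rec = reconstruction_loss(rendered, target, mask)
    print("reconstruction term:", rec)
    print("total objective:", total_loss(rec, prior=0.2, reg=0.5))
```

In this sketch the reconstruction term anchors the radiance field to the given image, while the prior and regularizer terms constrain the views that the single image leaves undetermined.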