Learning to Factorize and Relight a City
This work addresses scene decomposition for computer vision and offers a method for realistic image manipulation, though it appears incremental because it builds on existing intrinsic image decomposition concepts.
The paper uses a learning-based framework to disentangle outdoor scenes into temporally varying illumination and permanent scene factors, and shows that the learned factors allow realistic manipulation of novel images, such as changing lighting effects and scene geometry.
We propose a learning-based framework for disentangling outdoor scenes into temporally varying illumination and permanent scene factors. Inspired by classic intrinsic image decomposition, our learning signal builds upon two insights: 1) combining the disentangled factors should reconstruct the original image, and 2) the permanent factors should stay constant across multiple temporal samples of the same scene. To facilitate training, we assemble a city-scale dataset of outdoor timelapse imagery from Google Street View, where the same locations are captured repeatedly through time. This data represents an unprecedented scale of spatio-temporal outdoor imagery. We show that our learned disentangled factors can be used to manipulate novel images in realistic ways, such as changing lighting effects and scene geometry. Please visit factorize-a-city.github.io for animated results.
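The two insights behind the learning signal can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names, the squared-error penalties, and the choice to recombine factors by per-pixel multiplication (borrowed from classic intrinsic image decomposition, where an image is the product of reflectance and shading) are all assumptions made for illustration.

```python
import numpy as np


def reconstruction_loss(image, permanent, illumination):
    """Insight 1: recombining the factors should reproduce the image.

    Assumption: factors are recombined by per-pixel multiplication,
    as in classic intrinsic image decomposition.
    """
    recombined = permanent * illumination
    return float(np.mean((image - recombined) ** 2))


def consistency_loss(permanent_stack):
    """Insight 2: permanent factors should not change over time.

    permanent_stack has shape (T, H, W, C): one permanent-factor
    estimate per capture time of the same location. The loss is the
    mean squared deviation from the temporal average.
    """
    mean_permanent = permanent_stack.mean(axis=0, keepdims=True)
    return float(np.mean((permanent_stack - mean_permanent) ** 2))


# Toy data (hypothetical): one location observed at 3 different times.
rng = np.random.default_rng(0)
permanent = rng.uniform(0.2, 1.0, size=(4, 4, 3))         # fixed scene factor
illuminations = rng.uniform(0.5, 1.5, size=(3, 4, 4, 1))  # varies per capture
images = permanent[None] * illuminations                   # shape (3, 4, 4, 3)

# A perfect decomposition predicts the same permanent factor every time.
perfect_stack = np.repeat(permanent[None], 3, axis=0)
perfect_total = sum(
    reconstruction_loss(images[t], perfect_stack[t], illuminations[t])
    for t in range(3)
) + consistency_loss(perfect_stack)

# A decomposition whose permanent factor drifts over time is penalized.
drifting_stack = perfect_stack + rng.normal(0.0, 0.1, size=perfect_stack.shape)
drifting_consistency = consistency_loss(drifting_stack)
```

In this toy setup the perfect decomposition drives both terms to (numerically) zero, while the drifting one incurs a positive consistency penalty. A trained model would have to predict the factors from images alone, which this sketch does not attempt.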