Neural Spline Fields for Burst Image Fusion and Layer Separation
This work addresses burst image processing for photography and computer vision by enabling versatile layer separation, though the contribution is incremental: it builds on existing neural representations and flow models.
The paper tackles the problem of decomposing complex 3D scene effects from misaligned burst images. It proposes a neural spline field model that jointly fuses the burst into a single high-resolution reconstruction and separates it into transmission and obstruction layers. Without post-processing or learned priors, the model outperforms existing dedicated single-image and multi-view methods on tasks such as occlusion removal and reflection suppression.
Each photo in an image burst can be considered a sample of a complex 3D scene: the product of parallax, diffuse and specular materials, scene motion, and illuminant variation. While decomposing all of these effects from a stack of misaligned images is a highly ill-conditioned task, the conventional align-and-merge burst pipeline takes the other extreme: blending them into a single image. In this work, we propose a versatile intermediate representation: a two-layer alpha-composited image plus flow model constructed with neural spline fields -- networks trained to map input coordinates to spline control points. Our method is able to, during test-time optimization, jointly fuse a burst image capture into one high-resolution reconstruction and decompose it into transmission and obstruction layers. Then, by discarding the obstruction layer, we can perform a range of tasks including seeing through occlusions, reflection suppression, and shadow removal. Validated on complex synthetic and in-the-wild captures, we find that, with no post-processing steps or learned priors, our generalizable model is able to outperform existing dedicated single-image and multi-view obstruction removal approaches.
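
To make the core idea concrete, the following is a minimal sketch of the two ingredients the abstract names: a field that maps an input coordinate to spline control points, and an alpha composite of a transmission and an obstruction layer. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the spline type, so a uniform Catmull-Rom spline is assumed here. The function names (`catmull_rom`, `control_point_field`, `composite`), the tiny randomly initialized network standing in for a trained one, and the convention that `alpha` weights the obstruction layer are all hypothetical choices made for this example.

```python
import numpy as np

def catmull_rom(control_points, t):
    """Evaluate a uniform Catmull-Rom spline at t in [0, 1].

    control_points has shape (K, ...) with K >= 4. The curve passes through
    points 1 .. K-2 as t runs from 0 to 1. (Assumed spline type.)
    """
    p = np.asarray(control_points, dtype=float)
    n_seg = p.shape[0] - 3                 # number of cubic segments
    s = float(np.clip(t, 0.0, 1.0)) * n_seg
    i = min(int(s), n_seg - 1)             # index of the active segment
    u = s - i                              # local parameter in [0, 1]
    p0, p1, p2, p3 = p[i], p[i + 1], p[i + 2], p[i + 3]
    return 0.5 * (
        2.0 * p1
        + (-p0 + p2) * u
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * u ** 2
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * u ** 3
    )

def control_point_field(uv, weights):
    """Hypothetical stand-in for a trained network.

    Maps an image coordinate (u, v) to K two-dimensional flow control points.
    """
    hidden = np.tanh(weights["w1"] @ uv + weights["b1"])
    return (weights["w2"] @ hidden).reshape(-1, 2)

def composite(transmission, obstruction, alpha):
    """Alpha-composite two layers; alpha is the obstruction weight (assumed)."""
    return (1.0 - alpha) * transmission + alpha * obstruction

# Toy usage: random weights in place of test-time optimization.
rng = np.random.default_rng(0)
weights = {
    "w1": rng.normal(size=(8, 2)),
    "b1": rng.normal(size=8),
    "w2": 0.01 * rng.normal(size=(4 * 2, 8)),  # K = 4 control points
}
uv = np.array([0.25, 0.75])
control_points = control_point_field(uv, weights)  # shape (4, 2)
flow = catmull_rom(control_points, t=0.5)          # 2D flow at mid-burst
pixel = composite(transmission=0.8, obstruction=0.1, alpha=0.3)
```

In this sketch, evaluating the spline at a frame's timestamp gives a smooth per-pixel flow across the burst, and setting `alpha` to zero in `composite` corresponds to discarding the obstruction layer and keeping only the transmission layer.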