Neural Image Representations for Multi-Image Fusion and Layer Separation
This work addresses image fusion and layer separation for computer vision applications. The contribution is incremental: it builds on existing neural representation methods and adds alignment strategies tailored to the type of scene motion.
The paper aligns and fuses multiple burst images that exhibit camera motion and scene changes into a single canonical view using neural image representations. The fusion requires no reference frame, and the paper applies it to layer separation tasks.
We propose a framework for aligning and fusing multiple images into a single view using neural image representations (NIRs), also known as implicit or coordinate-based neural representations. Our framework targets burst images that exhibit camera ego motion and potential changes in the scene. We describe different strategies for alignment depending on the nature of the scene motion -- namely, perspective planar (i.e., homography), optical flow with minimal scene change, and optical flow with notable occlusion and disocclusion. With the neural image representation, our framework effectively combines multiple inputs into a single canonical view without the need for selecting one of the images as a reference frame. We demonstrate how to use this multi-frame fusion framework for various layer separation tasks. The code and results are available at https://shnnam.github.io/research/nir.
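The homography case described above can be illustrated with a minimal sketch. It is not the authors' implementation: a fixed analytic function (`canonical_image`, a hypothetical name) stands in for the learned coordinate-based network, and the per-frame homographies are hand-picked rather than estimated. The point is only that every frame is rendered by warping its own pixel coordinates into one shared canonical space, so no input frame has to serve as the reference.

```python
import math


def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w  # perspective divide


def canonical_image(u, v):
    """Assumed stand-in for the learned representation: (u, v) -> intensity."""
    return 0.5 + 0.5 * math.sin(u) * math.cos(v)


def render_frame(H, width, height):
    """Render one frame by querying the canonical image at warped coordinates."""
    return [
        [canonical_image(*apply_homography(H, x, y)) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical per-frame homographies: identity and a 2-pixel horizontal shift.
H_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H_shift = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

frame_a = render_frame(H_identity, 6, 4)
frame_b = render_frame(H_shift, 6, 4)
```

Here pixel `(x, y)` of `frame_b` shows the same canonical content as pixel `(x + 2, y)` of `frame_a`. In a learned setting, one would presumably fit the representation and the per-frame warps so that each rendered frame matches its observed input; the optical flow cases would replace the single matrix with a per-pixel displacement.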