Manifold Modeling in Embedded Space: A Perspective for Interpreting Deep Image Prior
This provides a new perspective for interpreting DIP, which could benefit researchers in computer vision and machine learning, though it is incremental in nature.
The study tackled the problem of explaining why Deep Image Prior (DIP) works well for image restoration by proposing a method called manifold modeling in embedded space (MMES), which achieved results competitive with DIP in tasks like image completion and denoising.
Deep image prior (DIP), which utilizes a deep convolutional network (ConvNet) structure itself as an image prior, has attracted attentions in computer vision and machine learning communities. It empirically shows the effectiveness of ConvNet structure for various image restoration applications. However, why the DIP works so well is still unknown, and why convolution operation is useful for image reconstruction or enhancement is not very clear. In this study, we tackle these questions. The proposed approach is dividing the convolution into ``delay-embedding'' and ``transformation (\ie encoder-decoder)'', and proposing a simple, but essential, image/tensor modeling method which is closely related to dynamical systems and self-similarity. The proposed method named as manifold modeling in embedded space (MMES) is implemented by using a novel denoising-auto-encoder in combination with multi-way delay-embedding transform. In spite of its simplicity, the image/tensor completion, super-resolution, deconvolution, and denoising results of MMES are quite similar even competitive to DIP in our extensive experiments, and these results would help us for reinterpreting/characterizing the DIP from a perspective of ``low-dimensional patch-manifold prior''.