CV, AI | Nov 25, 2022

PIP: Positional-encoding Image Prior

Nimrod Shabtay, Eli Schwartz, Raja Giryes

arXiv:2211.14298v3 | 4.89 citations | h-index: 56 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses image and video reconstruction for computer vision applications. It offers a method that needs fewer parameters than DIP and is more stable on video, though the advance is incremental because it builds on existing DIP concepts.

The authors revisit the Deep Image Prior framework for image reconstruction, replacing its convolutions with pixel-level MLPs fed by Fourier features. The resulting method performs similarly to DIP with fewer parameters and extends effectively to video tasks.

In Deep Image Prior (DIP), a Convolutional Neural Network (CNN) is fitted to map a latent space to a degraded (e.g. noisy) image but in the process learns to reconstruct the clean image. This phenomenon is attributed to the CNN's internal image prior. We revisit the DIP framework, examining it from the perspective of a neural implicit representation. Motivated by this perspective, we replace the random or learned latent with Fourier features (positional encoding). We show that, thanks to the properties of Fourier features, we can replace the convolution layers with simple pixel-level MLPs. We name this scheme "Positional Encoding Image Prior" (PIP) and show that it performs very similarly to DIP on various image-reconstruction tasks while requiring far fewer parameters. Additionally, we demonstrate that PIP can be easily extended to videos, where 3D-DIP struggles and suffers from instability. Code and additional examples for all tasks, including videos, are available on the project page https://nimrodshabtay.github.io/PIP/
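To make the idea in the abstract concrete, here is a minimal sketch of a Fourier-feature input passed through a pixel-level MLP. It is not the authors' implementation: the function names, the frequency schedule (2^k · π), the layer sizes, and the random initialization are all assumptions chosen for illustration, and the fitting loop against a degraded image is omitted. Each pixel is processed independently from its own encoded coordinate, which is what "pixel-level" means here (the same computation as a 1x1 convolution).

```python
import math
import random


def fourier_features(x, y, num_freqs=4):
    """Encode a pixel coordinate (x, y) in [0, 1] as sin/cos features.

    The frequencies 2^k * pi are an assumed schedule, not taken from the paper.
    """
    feats = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * math.pi
        for coord in (x, y):
            feats.append(math.sin(freq * coord))
            feats.append(math.cos(freq * coord))
    return feats  # length is 4 * num_freqs


def init_mlp(sizes, seed=0):
    """Create randomly initialized (weights, biases) pairs for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def mlp_forward(layers, v):
    """Apply the MLP to one feature vector, with ReLU between layers."""
    for i, (weights, biases) in enumerate(layers):
        v = [sum(w * a for w, a in zip(row, v)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            v = [max(0.0, a) for a in v]
    return v


def render(layers, height, width, num_freqs=4):
    """Evaluate the MLP at every pixel; no pixel sees its neighbours."""
    image = []
    for r in range(height):
        row = []
        for c in range(width):
            x = c / max(width - 1, 1)
            y = r / max(height - 1, 1)
            row.append(mlp_forward(layers, fourier_features(x, y, num_freqs)))
        image.append(row)
    return image


# Hypothetical sizes: 16 input features, one hidden layer of 32, RGB output.
layers = init_mlp([16, 32, 3])
img = render(layers, height=4, width=5)
```

In a DIP-style setup, the weights in `layers` would then be optimized so that the rendered image matches the degraded observation, with the clean image read off from the network output during fitting.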
