LG, CV, NE | Feb 8, 2016

Generating Images with Perceptual Similarity Metrics based on Deep Networks

arXiv:1602.02644v2 | 1204 citations
AI Analysis

This paper addresses perceptual quality in image generation, a practical concern for machine learning practitioners. The contribution is incremental, since it builds on existing loss-function methods.

The authors tackle over-smoothed results in image-generating models by proposing deep perceptual similarity metrics (DeePSiM) as loss functions. These metrics compute distances between deep network features rather than in image space, which yields sharper, more natural-looking images in applications such as autoencoders and network inversion.

Image-generating machine learning models are typically trained with loss functions based on distance in the image space. This often leads to over-smoothed results. We propose a class of loss functions, which we call deep perceptual similarity metrics (DeePSiM), that mitigate this problem. Instead of computing distances in the image space, we compute distances between image features extracted by deep neural networks. This metric better reflects perceptual similarity of images and thus leads to better results. We show three applications: autoencoder training, a modification of a variational autoencoder, and inversion of deep convolutional networks. In all cases, the generated images look sharp and resemble natural images.
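The difference between an image-space distance and a feature-space distance can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_comparator` is a hypothetical stand-in that uses simple pixel differences as "features", whereas the abstract describes features extracted by deep neural networks. The function names, the tiny 2x4 images, and the use of squared Euclidean distance are all assumptions made for illustration.

```python
from typing import Callable, List

Image = List[List[float]]   # grayscale image as rows of pixel values
Features = List[float]      # flattened feature vector


def image_space_loss(a: Image, b: Image) -> float:
    """Squared Euclidean distance computed directly on pixels."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def toy_comparator(img: Image) -> Features:
    """Hypothetical stand-in for a deep feature extractor.

    Returns horizontal and vertical pixel differences (crude edge
    responses). A real comparator would be a trained deep network.
    """
    feats: Features = []
    for row in img:  # horizontal differences within each row
        feats.extend(row[j + 1] - row[j] for j in range(len(row) - 1))
    for top, bottom in zip(img, img[1:]):  # vertical differences between rows
        feats.extend(b - t for t, b in zip(top, bottom))
    return feats


def feature_space_loss(
    a: Image,
    b: Image,
    comparator: Callable[[Image], Features] = toy_comparator,
) -> float:
    """Squared Euclidean distance between comparator features of two images."""
    fa, fb = comparator(a), comparator(b)
    return sum((x - y) ** 2 for x, y in zip(fa, fb))


# A sharp vertical edge and a blurred version of the same edge.
target = [[0.0, 0.0, 1.0, 1.0],
          [0.0, 0.0, 1.0, 1.0]]
blurred = [[0.0, 0.25, 0.75, 1.0],
           [0.0, 0.25, 0.75, 1.0]]

print("image-space loss:  ", image_space_loss(target, blurred))    # 0.25
print("feature-space loss:", feature_space_loss(target, blurred))  # 0.75
```

In this toy case the blurred edge is penalized three times as heavily in feature space as in pixel space, because the stand-in features respond to edge sharpness. Any callable that maps an image to a feature vector can be passed as `comparator`.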

Code Implementations: 10 repos

Data from Papers with Code (CC-BY-SA-4.0)

