CVLGNEJan 25, 2016

Pixel Recurrent Neural Networks

arXiv:1601.06759v32831 citations
Originality Highly original
AI Analysis

This work addresses the problem of expressive and scalable image modeling for unsupervised learning in computer vision, representing a novel method rather than an incremental improvement.

The paper tackled modeling the distribution of natural images by introducing a deep neural network that sequentially predicts pixels along spatial dimensions, achieving significantly better log-likelihood scores than previous state-of-the-art methods on natural images and providing benchmarks on ImageNet.

Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions. Our method models the discrete probability of the raw pixel values and encodes the complete set of dependencies in the image. Architectural novelties include fast two-dimensional recurrent layers and an effective use of residual connections in deep recurrent networks. We achieve log-likelihood scores on natural images that are considerably better than the previous state of the art. Our main results also provide benchmarks on the diverse ImageNet dataset. Samples generated from the model appear crisp, varied and globally coherent.

Code Implementations20 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes