CV | Jan 16, 2018

Unsupervised Representation Learning with Laplacian Pyramid Auto-encoders

arXiv:1801.05278v2 | 6 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses feature learning for computer vision tasks, but it is incremental as it builds on existing auto-encoder methods with a straightforward modification.

The paper tackles unsupervised representation learning by proposing Laplacian pyramid auto-encoders, a modification of deep convolutional auto-encoders that uses multiple sub-networks within a Laplacian pyramid framework to reconstruct images. The authors report improved classification and reconstruction performance.

Scale-space representation has been popular in the computer vision community due to its theoretical foundation. The motivation for generating a scale-space representation of a given data set originates from the basic observation that real-world objects are composed of different structures at different scales. Hence, it is reasonable to consider learning features with image pyramids generated by smoothing and down-sampling operations. In this paper we propose Laplacian pyramid auto-encoders, a straightforward modification of the deep convolutional auto-encoder architecture, for unsupervised representation learning. The method uses multiple encoding-decoding sub-networks within a Laplacian pyramid framework to reconstruct the original image and the low-pass filtered images. The last layer of each encoding sub-network also connects to an encoding layer of the sub-network in the next level, which aims to reverse the process of Laplacian pyramid generation. Experimental results showed that the Laplacian pyramid benefited the classification and reconstruction performance of deep auto-encoder approaches, and that batch normalization is critical to get deep auto-encoder approaches to begin learning.
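
For readers unfamiliar with the pyramid the abstract refers to, the following is a minimal sketch of classical Laplacian pyramid generation and its inversion. It is not the paper's network. It only illustrates the decomposition that the sub-networks are described as reversing. The function names are hypothetical, and 2x2 average pooling with nearest-neighbour up-sampling is assumed here as a simple stand-in for Gaussian smoothing and interpolation.

```python
import numpy as np

def downsample(img):
    """Smooth and halve the resolution with 2x2 average pooling.
    Assumption: a stand-in for Gaussian blur + subsampling; needs even height/width."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double the resolution by repeating each pixel (nearest neighbour)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Return [band_0, ..., band_{levels-1}, low_pass_residual]."""
    pyramid = []
    current = img
    for _ in range(levels):
        smaller = downsample(current)
        # Band-pass detail: what is lost when going to the coarser scale.
        pyramid.append(current - upsample(smaller))
        current = smaller
    pyramid.append(current)  # coarsest low-pass image
    return pyramid

def reconstruct(pyramid):
    """Invert the pyramid: start at the coarsest level and add details back."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = upsample(current) + band
    return current

rng = np.random.default_rng(0)
image = rng.random((32, 32))  # synthetic grayscale image, for illustration only
pyr = build_laplacian_pyramid(image, levels=3)
restored = reconstruct(pyr)
```

Each level stores only the detail missing from the next coarser level, so summing the up-sampled levels from coarse to fine recovers the input up to floating-point error. In the paper's setting, learned encoding-decoding sub-networks would take the place of these fixed operations.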

