Winner-Take-All Autoencoders
This work addresses the need for efficient unsupervised learning of sparse representations in computer vision, though it appears incremental by combining existing concepts like autoencoders and winner-take-all mechanisms.
The paper tackles the problem of learning hierarchical sparse representations in an unsupervised manner by proposing winner-take-all autoencoders, which enforce lifetime and spatial sparsity in activations, and shows competitive classification performance on datasets like MNIST, CIFAR-10, and ImageNet.
In this paper, we propose a winner-take-all method for learning hierarchical sparse representations in an unsupervised fashion. We first introduce fully-connected winner-take-all autoencoders which use mini-batch statistics to directly enforce a lifetime sparsity in the activations of the hidden units. We then propose the convolutional winner-take-all autoencoder which combines the benefits of convolutional architectures and autoencoders for learning shift-invariant sparse representations. We describe a way to train convolutional autoencoders layer by layer, where in addition to lifetime sparsity, a spatial sparsity within each feature map is achieved using winner-take-all activation functions. We will show that winner-take-all autoencoders can be used to to learn deep sparse representations from the MNIST, CIFAR-10, ImageNet, Street View House Numbers and Toronto Face datasets, and achieve competitive classification performance.