Unsupervised Representation Learning via Neural Activation Coding
This work addresses the problem of learning effective representations from unlabeled data for downstream machine learning applications. It offers a novel method that improves incrementally on existing approaches.
The paper tackles unsupervised representation learning by proposing neural activation coding (NAC), which maximizes the mutual information between the encoder's activation patterns and the data to increase nonlinear expressivity. On linear classification and nearest neighbor retrieval, NAC achieves better or comparable performance relative to baselines such as SimCLR and DistillHash.
We present neural activation coding (NAC) as a novel approach for learning deep representations from unlabeled data for downstream applications. We argue that the deep encoder should maximize its nonlinear expressivity on the data so that downstream predictors can take full advantage of its representation power. To this end, NAC maximizes the mutual information between activation patterns of the encoder and the data over a noisy communication channel. We show that learning a noise-robust activation code increases the number of distinct linear regions of ReLU encoders, and hence their maximum nonlinear expressivity. More interestingly, NAC learns both continuous and discrete representations of data, which we evaluate respectively on two downstream tasks: (i) linear classification on CIFAR-10 and ImageNet-1K and (ii) nearest neighbor retrieval on CIFAR-10 and FLICKR-25K. Empirical results show that NAC attains better or comparable performance on both tasks compared with recent baselines including SimCLR and DistillHash. In addition, NAC pretraining provides significant benefits to the training of deep generative models. Our code is available at https://github.com/yookoon/nac.
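To make the notion of an activation pattern concrete, the following minimal sketch computes the binary on/off pattern of each ReLU unit in a small randomly initialized MLP, counts how many distinct patterns a data sample occupies (a rough empirical proxy for the number of linear regions the data falls into), and retrieves neighbors by Hamming distance over those patterns. It is an illustration only, not the NAC objective or the authors' implementation: the network sizes, the random Gaussian inputs, and the helper names `activation_code`, `count_distinct_patterns`, and `hamming_nearest` are all assumptions made for this example.

```python
import numpy as np

def activation_code(x, weights, biases):
    """Binary activation pattern of a ReLU MLP, one row per input.

    Each bit records whether a hidden unit's pre-activation is positive.
    """
    h = x
    bits = []
    for W, b in zip(weights, biases):
        pre = h @ W + b
        bits.append(pre > 0)          # on/off state of every unit in this layer
        h = np.maximum(pre, 0.0)      # ReLU output fed to the next layer
    return np.concatenate(bits, axis=1).astype(np.uint8)

def count_distinct_patterns(codes):
    """Number of unique activation patterns observed in the sample."""
    return len(np.unique(codes, axis=0))

def hamming_nearest(query_code, db_codes, k=1):
    """Indices of the k database codes closest to the query in Hamming distance."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")[:k]

# Hypothetical toy setup: 2-D inputs, two hidden layers of 8 ReLU units each.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 8)), rng.standard_normal((8, 8))]
biases = [rng.standard_normal(8), rng.standard_normal(8)]
x = rng.standard_normal((256, 2))

codes = activation_code(x, weights, biases)       # shape (256, 16), entries in {0, 1}
n_regions = count_distinct_patterns(codes)        # distinct patterns hit by the sample
neighbors = hamming_nearest(codes[0], codes, k=3) # the query itself is returned first
```

Within a fixed activation pattern a ReLU network is an affine function of its input, so inputs that share a code lie in the same linear region. Counting distinct codes on a sample therefore gives a lower bound on the number of regions the data touches, and the same binary codes can serve as a discrete representation for retrieval.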