ML · CV · LG · May 28, 2019

Discrete Infomax Codes for Supervised Representation Learning

arXiv:1905.11656v2 · 12 citations
Originality: Incremental advance
AI Analysis

This work targets efficient and robust representation learning, offering incremental improvements in regularization and efficiency over existing methods.

The paper tackles the problem of learning compact discrete representations. It introduces Discrete InfoMax Codes (DIMCO), which maximize the mutual information between codes and labels under a regularizer that encourages the entries of a codeword to be independent. The authors show that shorter codes reduce overfitting in few-shot classification and that the learned codes are efficient in both memory and retrieval time.

Learning compact discrete representations of data is a key task on its own or for facilitating subsequent processing of data. In this paper we present a model that produces Discrete InfoMax Codes (DIMCO); we learn a probabilistic encoder that yields k-way d-dimensional codes associated with input data. Our model's learning objective is to maximize the mutual information between codes and labels with a regularization, which enforces entries of a codeword to be as independent as possible. We show that the infomax principle also justifies previous loss functions (e.g., cross-entropy) as its special cases. Our analysis also shows that using shorter codes, as DIMCO does, reduces overfitting in the context of few-shot classification. Through experiments in various domains, we observe this implicit meta-regularization effect of DIMCO. Furthermore, we show that the codes learned by DIMCO are efficient in terms of both memory and retrieval time compared to previous methods.
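
The objective described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the codeword factorizes across its d positions, so the code-label mutual information is approximated by a sum of per-position terms I(X_i; Y) = H(X_i) - H(X_i | Y), and it estimates the marginal and class-conditional distributions by averaging the encoder's outputs over a batch. The function names (`entropy`, `mean_dist`, `factorized_mi`, `code_bits`) are hypothetical and chosen only for this example.

```python
import math
from collections import defaultdict


def entropy(p):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mean_dist(dists):
    """Element-wise mean of a list of categorical distributions."""
    n = len(dists)
    return [sum(col) / n for col in zip(*dists)]


def factorized_mi(probs, labels):
    """Batch estimate of sum_i I(X_i; Y), assuming independent code positions.

    probs[n][i] is a k-way distribution for position i of sample n
    (the output of a hypothetical probabilistic encoder);
    labels[n] is the class label of sample n.
    """
    n, d = len(probs), len(probs[0])

    # Group the per-sample distributions by class label.
    by_class = defaultdict(list)
    for p, y in zip(probs, labels):
        by_class[y].append(p)

    total = 0.0
    for i in range(d):
        # H(X_i): entropy of the batch-averaged distribution at position i.
        h_x = entropy(mean_dist([p[i] for p in probs]))
        # H(X_i | Y): class-frequency-weighted entropy of per-class averages.
        h_x_given_y = 0.0
        for members in by_class.values():
            cond = mean_dist([p[i] for p in members])
            h_x_given_y += len(members) / n * entropy(cond)
        total += h_x - h_x_given_y
    return total


def code_bits(k, d):
    """Bits needed to store one k-way d-dimensional codeword."""
    return d * math.log2(k)


# Two classes, one binary code position that separates them perfectly.
informative = [[[1.0, 0.0]], [[0.0, 1.0]]]
# The same setup with codes that carry no label information.
uninformative = [[[0.5, 0.5]], [[0.5, 0.5]]]
mi_informative = factorized_mi(informative, [0, 1])
mi_uninformative = factorized_mi(uninformative, [0, 1])
```

In this toy batch the perfectly separating code attains the maximum of log 2 nats for a single binary position, while the uninformative code yields zero. `code_bits` shows why short codes are cheap to store: under the same assumptions, a 16-way 4-dimensional codeword needs 4 · log2(16) = 16 bits.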
