LG | ML | Mar 4, 2020

Contrastive estimation reveals topic posterior information to linear models

arXiv:2003.02234v1 | 67 citations
AI Analysis

This work addresses semi-supervised document classification, giving researchers and practitioners a method that improves performance when labeled data is limited. The contribution is incremental, since it builds on existing contrastive learning and topic modeling techniques.

The paper studies document classification under topic modeling assumptions. It proves that contrastive learning can recover document representations that reveal topic posterior information to linear models. Empirically, linear classifiers trained on these representations perform well with very few training examples and achieve competitive results on benchmark datasets.

Contrastive learning is an approach to representation learning that utilizes naturally occurring similar and dissimilar pairs of data points to find useful embeddings of data. In the context of document classification under topic modeling assumptions, we prove that contrastive learning is capable of recovering a representation of documents that reveals their underlying topic posterior information to linear models. We apply this procedure in a semi-supervised setup and demonstrate empirically that linear classifiers with these representations perform well in document classification tasks with very few training examples.
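The "naturally occurring similar and dissimilar pairs" mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a similar pair is the two halves of one document and a dissimilar pair is the first half of one document joined with the second half of another. The function name `make_contrastive_pairs` and the toy documents are hypothetical and are not taken from the paper.

```python
import random


def make_contrastive_pairs(docs, seed=0):
    """Build (first_half, second_half, label) triples from tokenized documents.

    label 1: both halves come from the same document (similar pair).
    label 0: the second half comes from a different document (dissimilar pair).
    Splitting each document in half is an illustrative assumption.
    """
    if len(docs) < 2:
        raise ValueError("need at least two documents to form dissimilar pairs")
    rng = random.Random(seed)
    halves = [(doc[: len(doc) // 2], doc[len(doc) // 2:]) for doc in docs]
    pairs = []
    for i, (first, second) in enumerate(halves):
        pairs.append((first, second, 1))
        # Draw an index j != i uniformly from the remaining documents.
        j = rng.randrange(len(docs) - 1)
        if j >= i:
            j += 1
        pairs.append((first, halves[j][1], 0))
    return pairs


# Toy corpus (hypothetical data, one rough "topic" per document).
docs = [
    "the team won the match after extra time".split(),
    "the senate passed the budget bill on friday".split(),
    "new telescope images reveal distant galaxy clusters".split(),
]
pairs = make_contrastive_pairs(docs)
```

A model trained to distinguish label 1 from label 0 on such pairs would then supply the document representation, and a linear classifier would be fit on top of it using the few labeled examples. Those steps are omitted from this sketch.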
