Learning Weakly-Supervised Contrastive Representations
This work addresses the challenge of improving representation learning when labeled data is limited, offering an incremental, weakly-supervised approach that leverages auxiliary information for better performance.
The paper tackles the problem of learning representations from weakly-supervised auxiliary information, such as hashtags. It proposes a two-stage contrastive learning approach that first clusters data according to this information and then learns similar representations within clusters and dissimilar ones across clusters. The results show that this method brings performance closer to that of supervised representations, outperforms other baseline methods that leverage auxiliary data, and also works well with clusters constructed without supervision.
We argue that one valuable form of information provided by auxiliary information is the data clustering it implies. For instance, considering hashtags as auxiliary information, we can hypothesize that Instagram images with the same hashtags will be semantically more similar to one another. With this intuition, we present a two-stage weakly-supervised contrastive learning approach. The first stage is to cluster data according to its auxiliary information. The second stage is to learn similar representations for data within the same cluster and dissimilar representations for data from different clusters. Our empirical experiments suggest the following three contributions. First, compared to conventional self-supervised representations, the auxiliary-information-infused representations bring the performance closer to that of supervised representations, which use direct downstream labels as supervision signals. Second, our approach performs the best in most cases when compared with other baseline representation learning methods that also leverage auxiliary data information. Third, we show that our approach also works well with clusters constructed without supervision (e.g., no auxiliary information), resulting in a strong unsupervised representation learning approach.
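The two stages described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a toy stage one in which samples with exactly the same hashtag set share a cluster, and a stage two that uses a supervised-contrastive-style InfoNCE loss with cluster ids in place of labels. The function names, the temperature value, and the exact loss form are all hypothetical choices made for illustration.

```python
import numpy as np


def clusters_from_hashtags(hashtag_sets):
    """Stage 1 (toy assumption): identical hashtag sets map to the same cluster id."""
    ids = {}
    return np.array([ids.setdefault(frozenset(h), len(ids)) for h in hashtag_sets])


def cluster_contrastive_loss(z, cluster_ids, temperature=0.1):
    """Stage 2 (assumed loss form): pull same-cluster embeddings together
    and push different-cluster embeddings apart."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-normalize embeddings
    n = len(z)
    self_mask = np.eye(n, dtype=bool)

    # Positives: other samples in the same cluster as the anchor.
    pos = (cluster_ids[:, None] == cluster_ids[None, :]) & ~self_mask
    has_pos = pos.any(axis=1)
    if not has_pos.any():
        return 0.0  # no anchor has a positive, so there is nothing to contrast

    # Scaled cosine similarities; an anchor is never compared with itself.
    sim = np.where(self_mask, -np.inf, z @ z.T / temperature)

    # Numerically stable log-softmax over all other samples in the batch.
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - (m + np.log(np.exp(sim - m).sum(axis=1, keepdims=True)))

    # Average negative log-probability of the positives, per anchor.
    pos_log_prob = np.where(pos, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / pos.sum(axis=1)[has_pos]
    return float(per_anchor.mean())


# Toy usage: four samples, two hashtag-induced clusters.
hashtags = [{"cat", "pet"}, {"pet", "cat"}, {"car"}, {"car"}]
cluster_ids = clusters_from_hashtags(hashtags)

aligned = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
misaligned = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]])

print(cluster_contrastive_loss(aligned, cluster_ids))
print(cluster_contrastive_loss(misaligned, cluster_ids))
```

In this toy example, the embeddings that agree with the hashtag clusters yield the lower loss, which is the behavior the second stage is meant to encourage. In practice the embeddings would come from an encoder trained by minimizing such a loss, and the clustering step would need to handle partially overlapping hashtag sets rather than exact matches.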