Bayesian Self-Supervised Contrastive Learning
This work addresses a key challenge in self-supervised learning, namely training encoders reliably on unlabeled data, though it appears incremental because it builds on existing contrastive methods.
The paper tackles the problem of false negatives in self-supervised contrastive learning by proposing a Bayesian loss (BCL) that corrects the resulting bias with importance weights; the authors report experiments that validate its effectiveness and superiority.
Recent years have witnessed many successful applications of contrastive learning in diverse domains, yet its self-supervised version still poses many exciting challenges. Because the negative samples are drawn from unlabeled datasets, a randomly selected sample may actually be a false negative for an anchor, leading to incorrect encoder training. This paper proposes a new self-supervised contrastive loss, the BCL loss, which still uses random samples from the unlabeled data while correcting the resulting bias with importance weights. The key idea is to design, under a Bayesian framework, the desired distribution for sampling hard true negatives. A prominent advantage is that this desired sampling distribution has a parametric structure, with a location parameter for debiasing false negatives and a concentration parameter for mining hard negatives. Experiments validate the effectiveness and superiority of the BCL loss.
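
To make the idea of "random negatives plus importance weights" concrete, the following is a minimal sketch of a debiased, hardness-weighted InfoNCE loss for a single anchor. It is not the BCL loss itself, whose exact form is not given in this abstract; it follows the general recipe of earlier debiased and hard-negative contrastive losses. The function name and the parameters `tau_plus` (an assumed prior probability that a random sample is a false negative) and `beta` (an assumed concentration-like knob that up-weights hard negatives) are illustrative assumptions.

```python
import math


def weighted_contrastive_loss(pos_sim, neg_sims, temperature=0.5,
                              tau_plus=0.1, beta=1.0):
    """Illustrative debiased, hardness-weighted InfoNCE for one anchor.

    NOT the paper's BCL loss: a generic sketch under stated assumptions.
    pos_sim  : similarity between the anchor and its positive, in [-1, 1]
    neg_sims : similarities between the anchor and randomly drawn samples
    tau_plus : assumed prior probability that a random sample is a false negative
    beta     : assumed concentration-like parameter; larger values put more
               weight on hard (high-similarity) negatives
    """
    n = len(neg_sims)
    pos = math.exp(pos_sim / temperature)
    neg = [math.exp(s / temperature) for s in neg_sims]

    # Self-normalized importance weights (mean 1): harder negatives weigh more.
    raw = [math.exp(beta * s) for s in neg_sims]
    mean_raw = sum(raw) / n
    weights = [r / mean_raw for r in raw]
    reweighted = sum(w * e for w, e in zip(weights, neg))

    # Debiasing: subtract the expected contribution of false negatives,
    # then clamp at the theoretical minimum so the log stays well defined.
    corrected = (reweighted - tau_plus * n * pos) / (1.0 - tau_plus)
    corrected = max(corrected, n * math.exp(-1.0 / temperature))

    return -math.log(pos / (pos + corrected))


if __name__ == "__main__":
    sims = [0.1, -0.2, 0.5]
    print(weighted_contrastive_loss(0.8, sims, tau_plus=0.0, beta=0.0))  # plain InfoNCE
    print(weighted_contrastive_loss(0.8, sims, tau_plus=0.1, beta=1.0))  # debiased + hard-weighted
```

With `tau_plus=0` and `beta=0` the sketch reduces to the standard InfoNCE loss, which makes it easy to see what each correction changes: raising `beta` shifts weight toward hard negatives, and raising `tau_plus` discounts the part of the negative term attributed to false negatives.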