Towards a Rigorous Analysis of Mutual Information in Contrastive Learning
This work addresses a foundational gap in unsupervised representation learning, although it may appear incremental because it builds on existing mutual information frameworks.
The paper addresses the difficulty of analyzing mutual information rigorously in contrastive learning. It introduces three novel methods and a few related theorems, then uses them to reassess three cases (small batch size, mutual information as a measure, and the InfoMin principle), either deepening understanding or correcting misconceptions.
Contrastive learning has emerged as a cornerstone of recent achievements in unsupervised representation learning. Its primary paradigm involves an instance discrimination task with a mutual information loss. The loss is known as InfoNCE, and it has yielded vital insights into contrastive learning through the lens of mutual information analysis. However, estimating mutual information can prove challenging, creating a gap between the elegance of its mathematical foundation and the complexity of its estimation. As a result, drawing rigorous insights or conclusions from mutual information analysis becomes difficult. In this study, we introduce three novel methods and a few related theorems, aimed at enhancing the rigor of mutual information analysis. Despite their simplicity, these methods can be highly useful. Leveraging these approaches, we reassess three instances of contrastive learning analysis, illustrating their capacity to facilitate deeper comprehension or to rectify pre-existing misconceptions. Specifically, we investigate small batch size, mutual information as a measure, and the InfoMin principle.
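For readers unfamiliar with InfoNCE, the following is a minimal sketch of the loss in its commonly used form, not necessarily the exact formulation analyzed in this paper. It assumes a K x K matrix of similarity scores whose diagonal entries are the positive pairs; the function names `info_nce` and `mi_lower_bound` are hypothetical and chosen only for illustration. The second function computes the standard bound, log K minus the loss, which can never exceed log K. This ceiling is one reason batch size is often discussed alongside mutual information estimation.

```python
import math


def info_nce(scores):
    """Mean InfoNCE loss for a K x K score matrix.

    scores[i][j] is the similarity between anchor i and candidate j.
    Assumed convention: the diagonal entry scores[i][i] is the positive pair.
    """
    k = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in row))
        # Cross-entropy of a softmax over the row, with the positive as target.
        total += log_sum_exp - row[i]
    return total / k


def mi_lower_bound(scores):
    """Standard InfoNCE bound: log K minus the loss (never above log K)."""
    return math.log(len(scores)) - info_nce(scores)


if __name__ == "__main__":
    uninformative = [[0.0, 0.0], [0.0, 0.0]]   # scores carry no information
    separated = [[10.0, 0.0], [0.0, 10.0]]     # positives clearly preferred
    print(mi_lower_bound(uninformative))
    print(mi_lower_bound(separated), "ceiling:", math.log(2))
```

With uninformative scores the loss equals log K and the bound is zero; as the positives become easier to pick out, the bound approaches log K but cannot pass it.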