Similarity-Dissimilarity Loss for Multi-label Supervised Contrastive Learning
This work addresses a critical bottleneck in multi-label learning for applications such as medical data analysis, although its contribution appears incremental, since it extends existing contrastive learning to multi-label settings.
The paper tackles the challenge of defining multi-label relations and identifying positive samples in multi-label supervised contrastive learning by proposing a Similarity-Dissimilarity Loss, which outperforms baselines and achieves state-of-the-art performance on MIMIC-III-Full.
Supervised contrastive learning has achieved remarkable success by leveraging label information; however, determining positive samples in multi-label scenarios remains a critical challenge. In multi-label supervised contrastive learning (MSCL), multi-label relations are not yet fully defined, which leads to ambiguity in identifying positive samples and in formulating the contrastive loss functions that shape the representation space. To address these challenges, we: (i) systematically formulate multi-label relations in MSCL, (ii) propose a novel Similarity-Dissimilarity Loss, which dynamically re-weights samples based on similarity and dissimilarity factors, (iii) provide theoretically grounded proofs, through rigorous mathematical analysis, that support the formulation and effectiveness of our method, and (iv) offer a unified form and paradigm for both single-label and multi-label supervised contrastive loss. We conduct experiments on both image and text modalities and further extend the evaluation to the medical domain. The results show that our method consistently outperforms baselines in comprehensive evaluations, demonstrating its effectiveness and robustness. Moreover, the proposed approach achieves state-of-the-art performance on MIMIC-III-Full.
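To make the re-weighting idea in (ii) and the unified form in (iv) concrete, the following is a minimal NumPy sketch. It is not the paper's definition. The abstract does not state the factors, so the choices here are assumptions for illustration only: the similarity factor is taken as the share of the anchor's labels that a candidate also carries, the dissimilarity factor as 1 / (1 + number of candidate labels the anchor lacks), and the product is applied as a weight outside the log of a standard supervised contrastive term. The function names `pair_weights` and `weighted_supcon_loss` are hypothetical.

```python
import numpy as np

def pair_weights(labels):
    """Illustrative pair weights from multi-hot labels of shape (N, C).

    ASSUMPTION: these two factors are one plausible instantiation,
    not the formulas from the paper.
    """
    labels = np.asarray(labels, dtype=float)
    inter = labels @ labels.T                      # |y_i & y_p| for every pair
    size = labels.sum(axis=1)                      # |y_i| per sample
    sim = inter / np.maximum(size[:, None], 1.0)   # share of anchor labels found in p
    extra = size[None, :] - inter                  # labels of p that anchor i lacks
    dis = 1.0 / (1.0 + extra)                      # shrinks as p carries more extra labels
    w = sim * dis
    np.fill_diagonal(w, 0.0)                       # an anchor is never its own positive
    return w

def weighted_supcon_loss(z, labels, tau=0.1):
    """Supervised contrastive loss with the illustrative weights above.

    ASSUMPTION: weights multiply the log-probability (outside the log).
    """
    z = np.asarray(z, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    logits = z @ z.T / tau
    self_mask = np.eye(len(z), dtype=bool)
    logits = np.where(self_mask, -np.inf, logits)      # exclude self from the denominator
    m = logits.max(axis=1, keepdims=True)
    log_denom = m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    log_prob = np.where(self_mask, 0.0, logits - log_denom)
    w = pair_weights(labels)
    n_pos = (w > 0).sum(axis=1)                        # candidates sharing a label
    valid = n_pos > 0                                  # skip anchors with no positives
    per_anchor = -(w * log_prob).sum(axis=1)[valid] / n_pos[valid]
    return per_anchor.mean()
```

Under these assumed factors, one-hot (single-label) inputs give a weight of exactly 1 for same-class pairs and 0 otherwise, so the expression reduces to an ordinary supervised contrastive loss. That mirrors the kind of unification the abstract describes, without claiming to reproduce the paper's exact form.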