Towards Domain-Agnostic Contrastive Learning
This work addresses the need for flexible self-supervised learning in domains that lack predefined invariances, offering an incremental improvement over existing methods.
The paper addresses the domain-specific limitations of contrastive self-supervised learning by proposing DACL, a domain-agnostic method that uses Mixup noise to create similar and dissimilar examples. Evaluated on tabular data, images, and graphs, DACL outperforms other domain-agnostic noising methods and improves domain-specific approaches such as SimCLR.
Despite recent success, most contrastive self-supervised learning methods are domain-specific, relying heavily on data augmentation techniques that require knowledge about a particular domain, such as image cropping and rotation. To overcome this limitation, we propose a novel domain-agnostic approach to contrastive learning, named DACL, that is applicable to domains where invariances, and thus data augmentation techniques, are not readily available. Key to our approach is the use of Mixup noise to create similar and dissimilar examples by mixing data samples differently at either the input or hidden-state level. To demonstrate the effectiveness of DACL, we conduct experiments across various domains such as tabular data, images, and graphs. Our results show that DACL not only outperforms other domain-agnostic noising methods, such as Gaussian noise, but also combines well with domain-specific methods, such as SimCLR, to improve self-supervised visual representation learning. Finally, we theoretically analyze our method and show its advantages over the Gaussian-noise-based contrastive learning approach.
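To make the idea of input-level Mixup noise concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a similar (positive) example is formed as a convex combination of an anchor with another randomly chosen sample, using a mixing coefficient close to 1 so that the result stays near the anchor. The function names, the parameter `alpha`, and the choice to draw the coefficient uniformly from `[alpha, 1]` are illustrative assumptions that the abstract does not specify.

```python
import random


def mixup_positive(anchor, other, lam):
    """Return lam * anchor + (1 - lam) * other, element-wise.

    With lam close to 1 the result stays close to `anchor`, so it can
    serve as a "similar" example without domain-specific augmentation.
    """
    return [lam * a + (1.0 - lam) * o for a, o in zip(anchor, other)]


def make_mixup_views(batch, alpha=0.9, rng=None):
    """Create one Mixup-noised view per sample in `batch`.

    `alpha` (hypothetical parameter) is the lower bound of the mixing
    coefficient; the coefficient is drawn uniformly from [alpha, 1].
    """
    if len(batch) < 2:
        raise ValueError("need at least two samples to mix")
    if rng is None:
        rng = random.Random(0)
    views = []
    for i, x in enumerate(batch):
        # Mix with a different sample drawn at random from the batch.
        j = rng.choice([k for k in range(len(batch)) if k != i])
        lam = rng.uniform(alpha, 1.0)
        views.append(mixup_positive(x, batch[j], lam))
    return views


if __name__ == "__main__":
    # Toy tabular batch: three samples with two features each.
    batch = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    first = make_mixup_views(batch, rng=random.Random(1))
    second = make_mixup_views(batch, rng=random.Random(2))
    # Under this sketch, first[i] and second[i] would form a positive
    # pair, while views built from other anchors would act as negatives.
    print(first[0], second[0])
```

Calling `make_mixup_views` twice with different random states yields two views per anchor. A contrastive objective would then pull views of the same anchor together and push views of different anchors apart. The same mixing could in principle be applied to hidden states rather than raw inputs, which this sketch does not cover.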