LG ML

Jul 21, 2021

On the Memorization Properties of Contrastive Learning

Ildus Sadrtdinov, Nadezhda Chirkova, Ekaterina Lobacheva

arXiv:2107.10143v1

5.53 citations

Originality: Synthesis-oriented

AI Analysis

This work examines how contrastive learning methods such as SimCLR memorize data, a question relevant to machine learning researchers aiming to improve training approaches. It appears incremental, as it builds on existing memorization studies.

The paper investigates the memorization properties of SimCLR, a contrastive self-supervised learning method, and compares them to those of supervised learning and random labels training. It finds that both training objects and augmentations vary in complexity, in the sense of how SimCLR learns them, and that SimCLR's distribution of training object complexity resembles that of random labels training.

Memorization studies of deep neural networks (DNNs) help to understand which patterns DNNs learn and how they learn them, and motivate improvements to DNN training approaches. In this work, we investigate the memorization properties of SimCLR, a widely used contrastive self-supervised learning approach, and compare them to the memorization of supervised learning and random labels training. We find that both training objects and augmentations may have different complexity in the sense of how SimCLR learns them. Moreover, we show that SimCLR is similar to random labels training in terms of the distribution of training object complexity.
