High Fidelity Visualization of What Your Self-Supervised Representation Knows About
This work addresses interpretability in self-supervised learning, giving researchers a novel visualization tool for analyzing representation properties, though it is incremental in that it applies diffusion models to this specific domain.
The authors tackle the challenge of understanding what information is retained in self-supervised learning (SSL) representations by using a Representation Conditional Diffusion Model (RCDM) to visualize these representations in data space. They show that SSL backbone representations are not invariant to the data augmentations used in training, debunking a common belief, and find that SSL representations appear more robust to small adversarial perturbations than supervised ones.
Discovering what is learned by neural networks remains a challenge. In self-supervised learning, classification is the most common task used to evaluate how good a representation is. However, relying only on such a downstream task can limit our understanding of what information is retained in the representation of a given input. In this work, we showcase the use of a Representation Conditional Diffusion Model (RCDM) to visualize in data space the representations learned by self-supervised models. The use of RCDM is motivated by its ability to generate high-quality samples, on par with state-of-the-art generative models, while ensuring that the representations of those samples are faithful, i.e., close to the one used for conditioning. By using RCDM to analyze self-supervised models, we are able to clearly show visually that i) SSL (backbone) representations are not invariant to the data augmentations they were trained with, thus debunking an often restated but mistaken belief; ii) SSL post-projector embeddings do appear invariant to these data augmentations, along with many other data symmetries; iii) SSL representations appear more robust to small adversarial perturbations of their inputs than representations trained in a supervised manner; and iv) SSL-trained representations exhibit an inherent structure that can be explored thanks to RCDM visualization and that enables image manipulation.
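The invariance claims in points i) and ii) above can also be illustrated numerically, as a complement to the visual analysis. The following is a minimal sketch and not the RCDM procedure itself: it compares the representation of an input with the representation of an augmented copy using cosine similarity. All names here (`toy_backbone`, `toy_invariant_embedding`, `invariance_score`) are hypothetical stand-ins, and the encoders are random linear maps rather than trained SSL models.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two vectors, flattened to 1-D."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


def invariance_score(encoder, image, augment):
    """Similarity between encoder(image) and encoder(augment(image)).

    A score of 1.0 means the encoder output did not change direction
    under the augmentation, i.e. it is invariant in this sense.
    """
    return cosine_similarity(encoder(image), encoder(augment(image)))


# --- Toy stand-ins (assumptions, not trained SSL models) ---
rng = np.random.default_rng(0)
image = rng.random((8, 8))               # fake 8x8 grayscale image
weights = rng.standard_normal((16, 64))  # fake linear "backbone"


def horizontal_flip(x):
    """A simple augmentation: mirror the image left to right."""
    return x[:, ::-1]


def toy_backbone(x):
    """Hypothetical backbone: a random linear map, not flip-invariant."""
    return weights @ x.ravel()


def toy_invariant_embedding(x):
    """Hypothetical embedding made flip-invariant by symmetrization."""
    return toy_backbone(x) + toy_backbone(horizontal_flip(x))


backbone_score = invariance_score(toy_backbone, image, horizontal_flip)
embedding_score = invariance_score(toy_invariant_embedding, image, horizontal_flip)

print(f"backbone similarity under flip:  {backbone_score:.3f}")
print(f"embedding similarity under flip: {embedding_score:.3f}")
```

With real models, `toy_backbone` and `toy_invariant_embedding` would be replaced by the SSL backbone and the post-projector head, and `horizontal_flip` by the augmentations used in training. A single similarity score says much less than a generated image about what information is kept, which is the gap the RCDM visualization is meant to fill.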