CV · Nov 25, 2021

Semantic-Aware Generation for Self-Supervised Visual Representation Learning

Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye

arXiv:2111.13163v18.719 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in self-supervised learning for computer vision, offering an incremental improvement by enhancing semantic preservation during generation.

The paper tackles semantic degradation in self-supervised visual representation learning by proposing Semantic-aware Generation (SaGe). SaGe uses a pre-trained evaluator so that image generation is judged by semantic rather than pixel-level similarity. The authors report stronger visual representations after pre-training on ImageNet-1K and evaluating on five downstream tasks.

In this paper, we propose a self-supervised visual representation learning approach that involves both generative and discriminative proxies. We focus on the generative part, which requires the target network to recover the original image from mid-level features. Unlike prior work, which mostly focuses on pixel-level similarity between the original and generated images, we advocate Semantic-aware Generation (SaGe), so that the generated image preserves richer semantics rather than fine details. The core idea of implementing SaGe is to use an evaluator, a deep network that is pre-trained without labels, to extract semantic-aware features. SaGe complements the target network with view-specific features and thus alleviates the semantic degradation caused by intensive data augmentations. We run SaGe on ImageNet-1K and evaluate the pre-trained models on five downstream tasks, including nearest neighbor test, linear classification, and fine-scaled image recognition, demonstrating its ability to learn stronger visual representations.
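
The contrast the abstract draws between pixel-level and semantic-aware objectives can be illustrated with a small sketch. This is not the paper's implementation: the evaluator here is a hypothetical stand-in (a fixed random projection followed by ReLU) for the frozen, label-free pre-trained network, and the mean-squared error in feature space is an assumed choice of distance. The names `make_evaluator`, `pixel_loss`, and `semantic_loss` are illustrative only.

```python
import numpy as np


def make_evaluator(in_dim, feat_dim, seed=0):
    """Build a hypothetical frozen evaluator: fixed random projection + ReLU.

    In the paper the evaluator is a deep network pre-trained without labels;
    this stand-in only mimics the "frozen feature extractor" role.
    """
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((in_dim, feat_dim)) / np.sqrt(in_dim)

    def evaluator(images):
        # Flatten each image and project it into feature space.
        flat = images.reshape(images.shape[0], -1)
        return np.maximum(flat @ weights, 0.0)

    return evaluator


def pixel_loss(original, generated):
    """Pixel-level objective: mean-squared error between the raw images."""
    return float(np.mean((original - generated) ** 2))


def semantic_loss(evaluator, original, generated):
    """Semantic-aware objective (assumed form): MSE between evaluator features."""
    f_orig = evaluator(original)
    f_gen = evaluator(generated)
    return float(np.mean((f_orig - f_gen) ** 2))


# Toy batch: 4 images of size 8x8 with 3 channels, plus a noisy "generated" copy.
rng = np.random.default_rng(1)
original = rng.random((4, 8, 8, 3))
generated = original + 0.05 * rng.standard_normal(original.shape)

evaluator = make_evaluator(in_dim=8 * 8 * 3, feat_dim=16)
print("pixel loss:   ", pixel_loss(original, generated))
print("semantic loss:", semantic_loss(evaluator, original, generated))
```

In a training loop, the semantic loss would be back-propagated into the generator and target network while the evaluator stays frozen. The sketch only shows how the two objectives differ in what they compare.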
