CV | Jul 27, 2023

Mixture of Self-Supervised Learning

arXiv:2307.14897v1 | h-index: 16 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses image classification challenges by enhancing self-supervised learning, representing an incremental improvement over prior methods that used single pretext tasks.

The paper tackles the problem of improving image classification by using multiple pretext tasks in self-supervised learning. It proposes a Gated Self-Supervised Learning method that employs a Mixture of Experts gating network to automatically focus on useful augmentations, and it reports performance gains across scenarios such as imbalanced CIFAR classification, adversarial perturbations, Tiny-Imagenet classification, and semi-supervised learning.

Self-supervised learning is a popular method because it can learn image features without using labels, which lets it overcome the limited labeled data that constrains supervised learning. It works by first training the model on a pretext task before the model is applied to a specific task. Examples of pretext tasks used in image recognition include rotation prediction, solving jigsaw puzzles, and predicting relative positions within an image. Previous studies have used only one type of transformation as a pretext task. This raises the question of what happens when more than one pretext task is used, with a gating network to combine them. We therefore propose the Gated Self-Supervised Learning method to improve image classification. It uses more than one transformation as a pretext task and uses the Mixture of Experts architecture as a gating network to combine the pretext tasks, so that the model automatically learns to focus on the augmentations most useful for classification. We test the performance of the proposed method in several scenarios: imbalanced CIFAR classification, adversarial perturbations, Tiny-Imagenet classification, and semi-supervised learning. In addition, Grad-CAM and t-SNE analyses are used to examine how well the proposed method identifies the important features that influence image classification, represents the data of each class, and separates different classes. Our code is available at https://github.com/aristorenaldo/G-SSL
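The idea of gating several pretext tasks can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation (see the repository linked above for that). The function names `softmax`, `rotate90`, `rotation_pretext_samples`, and `gated_loss` are hypothetical, the gate logits are supplied as plain numbers rather than produced by a learned network, and adding the gate-weighted pretext losses to the classification loss is an assumption made only for illustration.

```python
import math


def softmax(logits):
    """Turn raw gate scores into weights that are positive and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rotate90(image, k):
    """Rotate a 2D list-of-lists image clockwise by k * 90 degrees."""
    image = [list(row) for row in image]  # copy so the input is not mutated
    for _ in range(k % 4):
        # reversing the rows and then transposing is a clockwise quarter turn
        image = [list(row) for row in zip(*image[::-1])]
    return image


def rotation_pretext_samples(image):
    """Build (rotated image, label) pairs for a rotation-prediction task.

    The label is the number of quarter turns, so no human annotation is needed.
    """
    return [(rotate90(image, k), k) for k in range(4)]


def gated_loss(classification_loss, pretext_losses, gate_logits):
    """Combine losses with gate weights (assumed form, for illustration only).

    Each pretext task contributes its loss scaled by a softmax gate weight,
    so tasks with larger gate scores influence training more.
    """
    weights = softmax(gate_logits)
    auxiliary = sum(w * l for w, l in zip(weights, pretext_losses))
    return classification_loss + auxiliary, weights


if __name__ == "__main__":
    tiny_image = [[1, 2], [3, 4]]
    for rotated, label in rotation_pretext_samples(tiny_image):
        print(label, rotated)

    # Two hypothetical pretext tasks with made-up loss values.
    total, weights = gated_loss(1.0, [0.5, 1.5], [0.0, 0.0])
    print(total, weights)
```

In a real model the gate logits would come from a small network that sees the input features and is trained jointly with the backbone, so the weighting can change per sample instead of being fixed as it is here.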

Code Implementations: 1 repo