LG | AI | CV | Oct 20, 2022

A survey on Self Supervised learning approaches for improving Multimodal representation learning

arXiv:2210.11024v1 | 3 citations | h-index: 38
Originality: Synthesis-oriented
AI Analysis

The paper aggregates existing methods for researchers in multimodal learning, but its contribution is incremental: it reviews established techniques rather than introducing new ones.

This survey provides an overview of self-supervised learning approaches for multimodal representation learning, including cross-modal generation and pretraining, to address the high cost of annotating large datasets.

Recently, self-supervised learning has seen explosive growth and use in a variety of machine learning tasks because of its ability to avoid the cost of annotating large-scale datasets. This paper gives an overview of the best self-supervised learning approaches for multimodal learning. The presented approaches have been aggregated through an extensive study of the literature and tackle the application of self-supervised learning in different ways. The approaches discussed are cross-modal generation, cross-modal pretraining, cyclic translation, and generating unimodal labels in a self-supervised fashion.

