CV, IR | May 27, 2019

Label Prediction Framework for Semi-Supervised Cross-Modal Retrieval

arXiv:1905.11139v1 | 1 citation
Originality: Incremental advance
AI Analysis

This work aims to reduce the reliance of cross-modal retrieval on large labeled datasets. The advance is incremental, since the framework builds on existing baseline algorithms.

The paper tackles the problem of cross-modal retrieval with limited labeled data by proposing a semi-supervised framework that predicts labels for unlabeled data using complementary information from different modalities, showing significant performance improvements across three datasets.

Cross-modal data matching refers to the retrieval of data from one modality given a query from another modality. In general, supervised algorithms achieve better retrieval performance than their unsupervised counterparts, as they can learn more representative features by leveraging the available label information. However, this comes at the cost of requiring a large number of labeled examples, which may not always be available. In this work, we propose a novel framework in a semi-supervised setting, which can predict the labels of the unlabeled data using complementary information from different modalities. The proposed framework can be used as an add-on with any baseline cross-modal algorithm to give significant performance improvement, even when labeled data is limited. Finally, we analyze the challenging scenario where the unlabeled examples can come from classes not in the training data, and we evaluate the performance of our algorithm in this setting. Extensive evaluation using several baseline algorithms across three different datasets shows the effectiveness of our label prediction framework.
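
The abstract does not spell out how labels are predicted, so the following is only an illustrative sketch of one common way to use complementary modalities for pseudo-labeling, not the paper's actual method. It assumes two hypothetical per-modality classifiers (for example, one for images and one for text) that have already produced class probabilities for paired unlabeled samples. A sample receives a predicted label only when both modalities agree and both are confident. The function name `predict_labels` and the `threshold` parameter are assumptions made for this example.

```python
import numpy as np

def predict_labels(probs_a, probs_b, threshold=0.7):
    """Pseudo-label paired unlabeled samples by cross-modal agreement.

    probs_a, probs_b: arrays of shape (n_samples, n_classes) holding class
    probabilities from two modality-specific classifiers (hypothetical inputs).
    Returns (labels, indices) for the samples that were accepted.
    """
    probs_a = np.asarray(probs_a, dtype=float)
    probs_b = np.asarray(probs_b, dtype=float)

    # Most likely class according to each modality.
    pred_a = probs_a.argmax(axis=1)
    pred_b = probs_b.argmax(axis=1)

    # Joint confidence is the weaker of the two modality confidences.
    confidence = np.minimum(probs_a.max(axis=1), probs_b.max(axis=1))

    # Accept a sample only if both modalities agree and are confident.
    keep = (pred_a == pred_b) & (confidence >= threshold)
    return pred_a[keep], np.flatnonzero(keep)

# Four paired unlabeled samples, two classes (made-up probabilities).
image_probs = [[0.90, 0.10], [0.60, 0.40], [0.20, 0.80], [0.95, 0.05]]
text_probs = [[0.85, 0.15], [0.30, 0.70], [0.10, 0.90], [0.50, 0.50]]
labels, indices = predict_labels(image_probs, text_probs)
```

In this sketch, the accepted samples could be added to the labeled set before retraining whichever baseline cross-modal algorithm is in use. Rejected samples are simply left unlabeled; under the assumption that low agreement signals an unfamiliar class, this also gives a crude way to set aside examples from classes not seen in training.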

