IRCVLGJul 21, 2020

Deep Learning Techniques for Future Intelligent Cross-Media Retrieval

arXiv:2008.01191v1
AI Analysis

It offers a foundational taxonomy and review for researchers working on multi-modal data retrieval, but it is incremental as a survey paper.

This paper provides a comprehensive survey of deep learning techniques for cross-media retrieval, categorizing methods and datasets to address challenges like representation, alignment, and translation, aiming to bridge the 'media gap' for researchers and developers.

With the advancement in technology and the expansion of broadcasting, cross-media retrieval has gained much attention. It plays a significant role in big data applications and consists in searching and finding data from different types of media. In this paper, we provide a novel taxonomy according to the challenges faced by multi-modal deep learning approaches in solving cross-media retrieval, namely: representation, alignment, and translation. These challenges are evaluated on deep learning (DL) based methods, which are categorized into four main groups: 1) unsupervised methods, 2) supervised methods, 3) pairwise based methods, and 4) rank based methods. Then, we present some well-known cross-media datasets used for retrieval, considering the importance of these datasets in the context in of deep learning based cross-media retrieval approaches. Moreover, we also present an extensive review of the state-of-the-art problems and its corresponding solutions for encouraging deep learning in cross-media retrieval. The fundamental objective of this work is to exploit Deep Neural Networks (DNNs) for bridging the "media gap", and provide researchers and developers with a better understanding of the underlying problems and the potential solutions of deep learning assisted cross-media retrieval. To the best of our knowledge, this is the first comprehensive survey to address cross-media retrieval under deep learning methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes