LG | CV | May 18, 2021

A multimodal deep learning framework for scalable content based visual media retrieval

arXiv:2105.08665v1 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses scalable retrieval for visual media, but it appears incremental as it builds on existing deep learning methods with modular improvements.

The authors tackle content-based visual media retrieval by proposing a multimodal deep learning framework that works for both images and videos and by introducing an efficient comparison and filtering metric. They demonstrate its feasibility and efficiency through performance tests against conventional approaches.

We propose a novel, efficient, modular, and scalable framework for content-based visual media retrieval systems. It leverages deep learning, is flexible enough to work for images and videos conjointly, and introduces an efficient comparison and filtering metric for retrieval. We present findings from critical performance tests comparing our method to the predominant conventional approach to demonstrate the feasibility and efficiency of the proposed solution, along with best practices and possible improvements that may further augment the ability of retrieval architectures.
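The abstract does not specify the framework's internals, so the following is only a minimal sketch of the general idea of embedding-based retrieval with a comparison and filtering step. Everything in it is an assumption made for illustration: the mean pooling of per-frame features (treating an image as a one-frame video), the use of cosine similarity as the comparison metric, the similarity threshold as the filter, and the hypothetical names `media_embedding` and `retrieve`. It is not the authors' method.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def media_embedding(frame_features):
    """Assumed pooling: average per-frame feature vectors, then normalize.

    An image is treated as a single-frame video, so images and videos
    share one embedding space.
    """
    n = len(frame_features)
    dim = len(frame_features[0])
    pooled = [sum(frame[i] for frame in frame_features) / n for i in range(dim)]
    return l2_normalize(pooled)


def cosine_similarity(a, b):
    """Dot product of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, index, threshold=0.7, top_k=5):
    """Compare the query to every indexed item, drop weak matches, rank the rest.

    The threshold filter and top_k cutoff are illustrative choices.
    """
    scored = [(media_id, cosine_similarity(query, emb)) for media_id, emb in index.items()]
    kept = [(media_id, score) for media_id, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy 2-D features standing in for the output of a deep feature extractor.
index = {
    "image_a": media_embedding([[1.0, 0.0]]),
    "video_b": media_embedding([[1.0, 0.0], [0.0, 1.0]]),
    "image_c": media_embedding([[0.0, 1.0]]),
}
query = media_embedding([[1.0, 0.1]])
results = retrieve(query, index)
for media_id, score in results:
    print(media_id, round(score, 3))
```

In a real system the toy vectors would come from a trained network, and the linear scan in `retrieve` would typically be replaced by an approximate nearest-neighbour index to scale.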

Code Implementations: 3 repos
