DIME: An Online Tool for the Visual Comparison of Cross-Modal Retrieval Models
DIME offers researchers and practitioners in cross-modal retrieval a practical way to compare models more easily, though the contribution is incremental, since it builds on existing evaluation needs.
The paper tackles the challenge of quickly evaluating cross-modal retrieval models both quantitatively and qualitatively by presenting DIME, a modality-agnostic online tool that supports model comparison via a web browser GUI and doubles as an efficient dataset exploration tool.
Cross-modal retrieval relies on accurate models to retrieve relevant results for queries across modalities such as image, text, and video. In this paper, we build upon previous work by tackling the difficulty of quickly evaluating models both quantitatively and qualitatively. We present DIME (Dataset, Index, Model, Embedding), a modality-agnostic tool that handles multimodal datasets, trained models, and data preprocessors to support straightforward model comparison through a web browser graphical user interface. DIME inherently supports building modality-agnostic queryable indexes and extracting relevant feature embeddings, and thus doubles as an efficient cross-modal tool for exploring and searching datasets.
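To make the idea of a modality-agnostic queryable index concrete, the following is a minimal sketch and not DIME's actual implementation, which the abstract does not describe. It assumes that every item, whatever its modality, has already been mapped by some model into a shared embedding space, and that retrieval is done by brute-force cosine similarity. The names `EmbeddingIndex`, `add`, and `search` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


class EmbeddingIndex:
    """Hypothetical brute-force index over embeddings from any modality.

    Assumption: all embeddings live in one shared space produced by the
    model under evaluation, so the index never inspects the raw data.
    """

    def __init__(self) -> None:
        # Each entry is (item_id, modality, unit-length embedding).
        self._items: List[Tuple[str, str, List[float]]] = []

    def add(self, item_id: str, modality: str, embedding: Sequence[float]) -> None:
        """Store one item together with a label naming its modality."""
        self._items.append((item_id, modality, normalize(embedding)))

    def search(
        self,
        query: Sequence[float],
        k: int = 5,
        modality: Optional[str] = None,
    ) -> List[Tuple[str, float]]:
        """Return up to k (item_id, cosine similarity) pairs, best first.

        If `modality` is given, only items with that label are considered,
        which allows, for example, a text query to retrieve images only.
        """
        q = normalize(query)
        scored = []
        for item_id, item_modality, emb in self._items:
            if modality is not None and item_modality != modality:
                continue
            if len(emb) != len(q):
                raise ValueError("query and item embeddings differ in dimension")
            scored.append((item_id, sum(a * b for a, b in zip(q, emb))))
        # Sort by similarity, highest first, and keep the top k results.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]
```

Under these assumptions, comparing two models qualitatively amounts to building one such index per model from the same dataset, issuing the same query to both, and inspecting the two ranked lists side by side. A production system would likely replace the linear scan with an approximate nearest-neighbor structure.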