IR · CL · CV · LG · Nov 28, 2022

Long-tail Cross Modal Hashing

arXiv:2211.15162v1 · 15 citations · h-index: 60
AI Analysis

This paper addresses cross-modal retrieval on the imbalanced data found in real-world scenarios and offers a novel approach to multi-modal hashing.

The paper tackles the problem of learning hash codes for imbalanced multi-modal data with long-tail distributions, a setting to which existing methods fail to adapt. It proposes LtCMH, which significantly outperforms state-of-the-art baselines on long-tail datasets while maintaining comparable performance on balanced datasets.

Existing Cross Modal Hashing (CMH) methods are mainly designed for balanced data, while imbalanced data with a long-tail distribution is more common in the real world. Several long-tail hashing methods have been proposed, but they cannot adapt to multi-modal data because of the complex interplay between labels and the individuality and commonality information of multi-modal data. Furthermore, CMH methods mostly mine the commonality of multi-modal data to learn hash codes, which may override tail labels encoded by the individuality of the respective modalities. In this paper, we propose LtCMH (Long-tail CMH) to handle imbalanced multi-modal data. LtCMH first adopts auto-encoders to mine the individuality and commonality of different modalities by minimizing the dependency between the individuality of the respective modalities and by enhancing the commonality of these modalities. It then dynamically combines the individuality and commonality with direct features extracted from the respective modalities to create meta features that enrich the representation of tail labels, and binarizes the meta features to generate hash codes. LtCMH significantly outperforms state-of-the-art baselines on long-tail datasets and achieves better (or comparable) performance on datasets with balanced labels.
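
The last step of the pipeline described above (combine features into a meta feature, then binarize it into a hash code) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy feature values, the fixed mixing weights (the paper combines features dynamically), and the use of sign binarization. It is not the LtCMH implementation.

```python
# Illustrative sketch only. Names, weights and feature values are hypothetical;
# the paper learns features with auto-encoders and combines them dynamically.

def combine(direct, individuality, commonality, weights):
    """Weighted sum of three equal-length feature vectors, standing in
    for the combination that produces a meta feature."""
    w_d, w_i, w_c = weights
    return [w_d * d + w_i * i + w_c * c
            for d, i, c in zip(direct, individuality, commonality)]

def binarize(meta):
    """Sign binarization (an assumed scheme): each entry becomes +1 or -1."""
    return [1 if v >= 0 else -1 for v in meta]

def hamming(a, b):
    """Number of positions at which two hash codes differ."""
    return sum(x != y for x, y in zip(a, b))

# Toy 4-bit example: one image and one text that share a commonality vector.
shared = [0.5, -0.1, -0.2, -0.6]
weights = (0.5, 0.2, 0.3)  # fixed here purely for illustration

image_code = binarize(combine([0.9, -0.2, 0.4, -0.7],
                              [0.1, -0.5, 0.3, 0.2],
                              shared, weights))
text_code = binarize(combine([0.7, 0.1, 0.1, -0.4],
                             [0.2, -0.6, -0.4, -0.1],
                             shared, weights))

print(image_code, text_code, hamming(image_code, text_code))
```

Retrieval across modalities then reduces to ranking database codes by Hamming distance to the query code, which is why a compact binary code is attractive for large collections.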
