CV NE Jul 6, 2012

Multimodal similarity-preserving hashing

Jonathan Masci, Michael M. Bronstein, Alexander A. Bronstein, Jürgen Schmidhuber

arXiv:1207.1522v1, 200 citations

Originality: Highly original

AI Analysis

This paper addresses the challenge of efficient cross-modal retrieval for multimedia applications. It represents an incremental advancement built on a novel neural network architecture.

The paper tackles the problem of hashing multimodal data into a comparable representation space, achieving significant performance improvements over state-of-the-art hashing methods on multimedia retrieval tasks.

We introduce an efficient computational framework for hashing data belonging to multiple modalities into a single representation space where they become mutually comparable. The proposed approach is based on a novel coupled siamese neural network architecture and allows unified treatment of intra- and inter-modality similarity learning. Unlike existing cross-modality similarity learning approaches, our hashing functions are not limited to binarized linear projections and can assume arbitrarily complex forms. We show experimentally that our method significantly outperforms state-of-the-art hashing approaches on multimedia retrieval tasks.
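
To make the idea of a shared representation space concrete, here is a minimal sketch of the inference side only: two modality-specific nonlinear networks each map their input to a binary code of the same length, so that codes from different modalities can be compared by Hamming distance. Everything specific here is an assumption for illustration and not taken from the paper: the names `image_net` and `text_net`, the layer sizes, the tanh activations, and the random untrained weights. The training procedure that couples the two networks is not shown.

```python
import math
import random


def make_mlp(in_dim, hidden_dim, n_bits, rng):
    """Build a two-layer network with random placeholder weights (untrained)."""
    w1 = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0, 1) for _ in range(hidden_dim)] for _ in range(n_bits)]
    return w1, w2


def embed(mlp, x):
    """Nonlinear map of an input vector to (-1, 1)^n_bits via two tanh layers."""
    w1, w2 = mlp
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
    return [math.tanh(sum(w * v for w, v in zip(row, hidden))) for row in w2]


def hash_code(mlp, x):
    """Binarize the embedding by sign to obtain a hash code."""
    return [1 if v >= 0 else 0 for v in embed(mlp, x)]


def hamming(a, b):
    """Number of bit positions in which two equal-length codes differ."""
    return sum(p != q for p, q in zip(a, b))


rng = random.Random(0)
N_BITS = 16  # hypothetical code length

# One network per modality; input sizes differ, output code length is shared.
image_net = make_mlp(in_dim=8, hidden_dim=12, n_bits=N_BITS, rng=rng)
text_net = make_mlp(in_dim=5, hidden_dim=12, n_bits=N_BITS, rng=rng)

image_x = [rng.random() for _ in range(8)]  # stand-in image descriptor
text_x = [rng.random() for _ in range(5)]   # stand-in text descriptor

code_img = hash_code(image_net, image_x)
code_txt = hash_code(text_net, text_x)

# Both codes live in the same Hamming space, so they are directly comparable.
distance = hamming(code_img, code_txt)
```

With untrained weights the distance carries no meaning; the point of the sketch is only that inputs of different dimensionality end up as codes of equal length, which is what makes cross-modal comparison possible once the networks are trained.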

