CV · IR · Mar 15, 2017

End-to-end Binary Representation Learning via Direct Binary Embedding

arXiv:1703.04960v2 · 1 citation
Originality: Highly original
AI Analysis

The paper addresses the need for efficient binary representation learning in computer vision with an end-to-end approach that the authors report improves performance on image retrieval and image annotation.

The paper tackles binary representation learning for large-scale computer vision tasks by introducing Direct Binary Embedding (DBE), which learns binary codes without quantization error. The authors report that DBE significantly outperforms state-of-the-art methods on natural object recognition, image retrieval, and image annotation.
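To illustrate why binary codes suit large-scale retrieval, here is a minimal sketch of ranking a database by Hamming distance. It is a generic illustration and not the DBE method: the helper names (`binarize`, `hamming_distance`, `rank_by_hamming`) are hypothetical, and thresholding activations at 0.5 is an assumption made only to produce example codes.

```python
def binarize(activations, threshold=0.5):
    """Pack thresholded activations into one integer code.

    The 0.5 threshold is an assumption for illustration; the source does
    not say how DBE turns layer activations into bits.
    """
    code = 0
    for value in activations:
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming_distance(a, b):
    """Count the bit positions where two integer-packed codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


query = binarize([0.9, 0.1, 0.8, 0.2])
ranking = rank_by_hamming(query, [0b0101, 0b1010, 0b1000])
```

Comparing two codes takes one XOR and a bit count, which is the usual reason binary representations are attractive when the database is large.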

Learning binary representations is essential to large-scale computer vision tasks. Most existing algorithms require a separate quantization constraint to learn effective hashing functions. In this work, we present Direct Binary Embedding (DBE), a simple yet very effective algorithm to learn binary representations in an end-to-end fashion. By appending an ingeniously designed DBE layer to a deep convolutional neural network (DCNN), DBE learns binary codes directly from the continuous DBE layer activation without quantization error. By employing the deep residual network (ResNet) as the DCNN component, DBE captures rich semantics from images. Furthermore, to handle multilabel images, we design a joint cross entropy loss that includes both softmax cross entropy and weighted binary cross entropy, in consideration of the correlation and independence of labels, respectively. Extensive experiments demonstrate the significant superiority of DBE over state-of-the-art methods on the tasks of natural object recognition, image retrieval, and image annotation.
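The joint cross entropy loss mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' formulation: the abstract does not give the weights, the target normalization, or how the two terms are combined, so `pos_weight`, `mix`, and the normalized multi-hot softmax targets are all hypothetical choices.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softmax_cross_entropy(logits, targets):
    """Cross entropy between softmax(logits) and a target distribution."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for x, t in zip(logits, targets))


def weighted_binary_cross_entropy(logits, labels, pos_weight=1.0):
    """Mean per-label sigmoid cross entropy; positives scaled by pos_weight.

    Uses -log(sigmoid(x)) = softplus(-x) and -log(1 - sigmoid(x)) = softplus(x).
    """
    total = 0.0
    for x, y in zip(logits, labels):
        total += pos_weight * y * softplus(-x) + (1 - y) * softplus(x)
    return total / len(logits)


def joint_cross_entropy(logits, labels, pos_weight=1.0, mix=1.0):
    """Sum of a softmax term (labels treated jointly) and a binary term
    (labels treated independently).

    Assumptions: at least one positive label; multi-hot labels are
    normalized to sum to 1 for the softmax term; `mix` scales the
    binary term.
    """
    k = sum(labels)
    targets = [y / k for y in labels]
    return (softmax_cross_entropy(logits, targets)
            + mix * weighted_binary_cross_entropy(logits, labels, pos_weight))


loss = joint_cross_entropy([2.0, -1.0, 0.5], [1, 0, 1], pos_weight=2.0)
```

The softmax term couples the labels through a shared normalizer, while the binary term scores each label on its own, which mirrors the "correlation and independence" distinction the abstract draws.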
