CV · LG · Aug 14, 2019

Memory-Based Neighbourhood Embedding for Visual Recognition

arXiv:1908.04992v1 · 0.0041 citations
AI Analysis 55

This paper addresses the challenge of improving visual recognition accuracy for applications such as image search and few-shot learning. It represents an incremental advance over existing methods.

The paper tackles the problem of learning discriminative image feature embeddings. It proposes Memory-based Neighbourhood Embedding (MNE), which enhances a CNN feature using information from its neighbourhood, and the authors report that it significantly outperforms state-of-the-art methods on image search and few-shot learning tasks.

Learning discriminative image feature embeddings is of great importance to visual recognition. To achieve better feature embeddings, most current methods focus on designing different network structures or loss functions, and the estimated feature embeddings are usually only related to the input images. In this paper, we propose Memory-based Neighbourhood Embedding (MNE) to enhance a general CNN feature by considering its neighbourhood. The method aims to solve two critical problems, i.e., how to acquire more relevant neighbours during network training and how to aggregate the neighbourhood information for a more discriminative embedding. We first augment the network with an episodic memory module, which can provide more relevant neighbours for both training and testing. Then the neighbours are organized in a tree graph with the target instance as the root node. The neighbourhood information is gradually aggregated to the root node in a bottom-up manner, and aggregation weights are supervised by the class relationships between the nodes. We apply MNE to image search and few-shot learning tasks. Extensive ablation studies demonstrate the effectiveness of each component, and our method significantly outperforms the state-of-the-art approaches.
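
The pipeline described in the abstract (retrieve neighbours from a memory, arrange them in a tree rooted at the target, aggregate bottom-up) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a static list of feature vectors as the memory, cosine-similarity nearest-neighbour retrieval, and softmax-over-similarity weights in place of the learned, class-supervised aggregation weights the paper describes. All function names and parameters (`nearest`, `aggregate`, `embed`, `k`, `depth`, `temperature`) are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def nearest(query, memory, k, exclude=frozenset()):
    """Indices of the k memory entries most similar to the query."""
    scored = [(cosine(query, feat), i)
              for i, feat in enumerate(memory) if i not in exclude]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


def aggregate(parent, children, temperature=0.1):
    """Blend a parent with its children.

    Assumption: weights are a softmax over cosine similarity to the parent
    (the parent itself included). The paper instead learns these weights
    under supervision from class relationships.
    """
    nodes = [parent] + children
    logits = [cosine(parent, n) / temperature for n in nodes]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [sum(w / total * n[d] for w, n in zip(exps, nodes))
            for d in range(len(parent))]


def embed(query, memory, k=2, depth=2, exclude=frozenset()):
    """Build a neighbour tree rooted at the query and aggregate bottom-up.

    Each node's children are its k nearest memory entries; recursion
    resolves the deepest level first, so information flows leaf-to-root.
    """
    if depth == 0 or not memory:
        return list(query)
    children = [embed(memory[i], memory, k, depth - 1, exclude | {i})
                for i in nearest(query, memory, k, exclude)]
    return aggregate(query, children)
```

With a memory holding two clusters of features, calling `embed(query, memory, k=2, depth=2)` on a query that lies between them pulls the result toward the cluster the query is already closer to, which is the intuition behind using the neighbourhood to make an embedding more discriminative.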

