CV, LG | May 8, 2017

Deep Descriptor Transforming for Image Co-Localization

Xiu-Shen Wei, Chen-Lin Zhang, Yao Li, Chen-Wei Xie, Jianxin Wu, Chunhua Shen, Zhi-Hua Zhou

arXiv:1705.02758v17.636 citations

Originality: Highly original

AI Analysis

The paper addresses the need for reusable model design in machine learning applications, focusing on image co-localization, and proposes an approach aimed at better localization accuracy and generalization to unseen categories.

The paper reuses pre-trained deep convolutional models for image co-localization through Deep Descriptor Transforming (DDT), which treats convolutional activations as a detector for the object common to a set of images. On benchmark co-localization datasets, DDT outperforms prior state-of-the-art methods by a large margin.

Reusable model design becomes desirable with the rapid expansion of machine learning applications. In this paper, we focus on the reusability of pre-trained deep convolutional models. Specifically, different from treating pre-trained models as feature extractors, we reveal more treasures beneath convolutional layers, i.e., the convolutional activations could act as a detector for the common object in the image co-localization problem. We propose a simple but effective method, named Deep Descriptor Transforming (DDT), for evaluating the correlations of descriptors and then obtaining the category-consistent regions, which can accurately locate the common object in a set of images. Empirical studies validate the effectiveness of the proposed DDT method. On benchmark image co-localization datasets, DDT consistently outperforms existing state-of-the-art methods by a large margin. Moreover, DDT also demonstrates good generalization ability for unseen categories and robustness for dealing with noisy data.
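The abstract does not spell out the procedure, so the following is only a minimal sketch of the idea of "evaluating the correlations of descriptors" to find category-consistent regions. It assumes the correlation step is a principal component analysis over all convolutional descriptors pooled from the image set, and that positions with a positive projection onto the first principal component belong to the common object. The feature maps are synthetic stand-ins for real convolutional activations, and the function names (`ddt_indicator_maps`, `bounding_box`, `make_toy_features`), the sign-orientation heuristic, and the plain min/max box are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def ddt_indicator_maps(feature_maps):
    """Project every descriptor onto the first principal component.

    feature_maps: array of shape (N, H, W, D), one H x W grid of
    D-dimensional descriptors per image (e.g. a conv-layer output).
    Returns an array of shape (N, H, W) with one score per position.
    """
    n, h, w, d = feature_maps.shape
    descriptors = feature_maps.reshape(-1, d).astype(np.float64)

    # Center the descriptors pooled from ALL images, then build their covariance.
    centered = descriptors - descriptors.mean(axis=0)
    cov = centered.T @ centered / centered.shape[0]

    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector with the largest eigenvalue.
    _, eigvecs = np.linalg.eigh(cov)
    principal = eigvecs[:, -1]

    scores = centered @ principal

    # Assumption: an eigenvector's sign is arbitrary, so orient it such that
    # the strongest response is positive (a heuristic chosen for this sketch).
    if abs(scores.min()) > scores.max():
        scores = -scores
    return scores.reshape(n, h, w)


def bounding_box(indicator):
    """Box (row_min, col_min, row_max, col_max) around positive scores.

    Returns None when no position is positive, i.e. the sketch finds
    no common object in that image.
    """
    rows, cols = np.nonzero(indicator > 0)
    if rows.size == 0:
        return None
    return (int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max()))


def make_toy_features(seed=0):
    """Two synthetic 4 x 4 x 3 'feature maps' sharing one object pattern."""
    rng = np.random.default_rng(seed)
    feats = rng.normal(scale=0.01, size=(2, 4, 4, 3))  # near-zero background
    feats[0, 1:3, 1:3] += [5.0, 5.0, 0.0]  # object in image 0
    feats[1, 0:2, 2:4] += [5.0, 5.0, 0.0]  # same pattern, different place
    return feats


maps = ddt_indicator_maps(make_toy_features())
boxes = [bounding_box(m) for m in maps]
print(boxes)
```

In a real pipeline the indicator maps would come from a pre-trained network's convolutional layer and be resized to the input image resolution before boxes are drawn; both steps are omitted here to keep the sketch self-contained.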
