CV Mar 31, 2018

Compare and Contrast: Learning Prominent Visual Differences

arXiv:1804.00112v24.66 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for more natural human-computer interaction in vision systems by focusing on prominent differences, though it is incremental as it builds on relative attribute models.

The paper tackled the problem of identifying the most noticeable visual differences between images, which humans naturally prioritize, by introducing a model for predicting prominent differences. The model outperformed baseline methods on UT-Zap50K shoes and LFW10 faces datasets and improved tasks like image search and description generation.

Relative attribute models can compare images in terms of all detected properties or attributes, exhaustively predicting which image is fancier, more natural, and so on without any regard to ordering. However, when humans compare images, certain differences will naturally stick out and come to mind first. These most noticeable differences, or prominent differences, are likely to be described first. In addition, many differences, although present, may not be mentioned at all. In this work, we introduce and model prominent differences, a rich new functionality for comparing images. We collect instance-level annotations of most noticeable differences, and build a model trained on relative attribute features that predicts prominent differences for unseen pairs. We test our model on the challenging UT-Zap50K shoes and LFW10 faces datasets, and outperform an array of baseline methods. We then demonstrate how our prominence model improves two vision tasks, image search and description generation, enabling more natural communication between people and vision systems.
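As a rough illustration of the setup the abstract describes, the sketch below turns per-attribute relative attribute scores for an image pair into order-invariant features and trains a small classifier to predict which attribute annotators would name as the most noticeable difference. Everything here is an assumption made for illustration: the abstract does not specify the feature construction or the learner, so absolute score gaps and a multiclass perceptron stand in for them, the three attribute names are hypothetical, and the toy pairs are invented values rather than data from UT-Zap50K or LFW10. A "largest gap" heuristic is included only as an example of a simple comparison point, not as one of the paper's baselines.

```python
# Hypothetical attribute vocabulary; not taken from the paper.
ATTRIBUTES = ["sporty", "shiny", "tall"]


def pair_features(scores_a, scores_b):
    """Order-invariant pair features: absolute per-attribute score gaps."""
    return [abs(a - b) for a, b in zip(scores_a, scores_b)]


def largest_gap_baseline(scores_a, scores_b):
    """Heuristic: call the attribute with the widest score gap prominent."""
    feats = pair_features(scores_a, scores_b)
    return max(range(len(feats)), key=feats.__getitem__)


def predict(weights, feats):
    """Return the index of the attribute with the highest linear score."""
    x = list(feats) + [1.0]  # trailing 1.0 is the bias input
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train_perceptron(examples, n_attrs, epochs=100):
    """Multiclass perceptron with one weight vector (plus bias) per attribute."""
    weights = [[0.0] * (n_attrs + 1) for _ in range(n_attrs)]
    for _ in range(epochs):
        for feats, label in examples:
            guess = predict(weights, feats)
            if guess != label:
                x = list(feats) + [1.0]
                for i, x_i in enumerate(x):
                    weights[label][i] += x_i  # pull the annotated attribute up
                    weights[guess][i] -= x_i  # push the wrong guess down
    return weights


# Toy training set: (pair features, index of the annotated prominent attribute).
# Values are invented for illustration only.
TOY_PAIRS = [
    ([0.8, 0.1, 0.0], 0),
    ([0.1, 0.9, 0.2], 1),
    ([0.0, 0.2, 0.7], 2),
    ([0.6, 0.2, 0.1], 0),
    ([0.2, 0.1, 0.9], 2),
    ([0.1, 0.7, 0.3], 1),
]

if __name__ == "__main__":
    model = train_perceptron(TOY_PAIRS, n_attrs=len(ATTRIBUTES))
    new_pair = pair_features([0.9, 0.2, 0.5], [0.1, 0.3, 0.5])
    print("predicted prominent difference:", ATTRIBUTES[predict(model, new_pair)])
```

On real annotations, a learned model of this kind could diverge from the largest-gap heuristic, for example when people tend to mention one attribute first even though another has a wider score gap; the toy data above is too simple to show that effect.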
