CV · Mar 6, 2020

Captioning Images with Novel Objects via Online Vocabulary Expansion

arXiv:2003.03305v1 · 2.32 citations

Originality: Incremental advance

AI Analysis

This paper addresses the data-collection and retraining costs that image-captioning systems incur when they encounter novel objects. The advance appears incremental, since it builds on existing models.

The paper tackles the problem of generating image captions that contain novel objects without costly retraining. It proposes estimating the word embeddings of novel objects from a small number of their image features. The method can be integrated with general image-captioning models, and the reported experiments indicate that it is effective.

In this study, we introduce a low-cost method for generating descriptions of images that contain novel objects. Constructing a model that can describe images with novel objects is generally costly for two reasons: (1) a large amount of data must be collected for each category, and (2) the entire system must be retrained. Humans who see only a few instances of a novel object can estimate its properties by associating its appearance with known objects. Accordingly, we propose a method that describes images with novel objects without retraining, using word embeddings of the objects estimated from only a small number of their image features. The method can be integrated with general image-captioning models. The experimental results show the effectiveness of our approach.
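The abstract does not spell out how a word embedding is estimated from a few image features, so the following is only a minimal sketch of one plausible reading, not the authors' estimator. It assumes that each known object has a prototype image feature and a word embedding, and that the novel object's embedding can be approximated as a similarity-weighted average of the known embeddings. The function names, the softmax weighting, the `temperature` parameter, and the toy vectors are all hypothetical.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def estimate_embedding(novel_features, known_features, known_embeddings,
                       temperature=0.1):
    """Estimate a word embedding for a novel object (hypothetical scheme).

    The few image features of the novel object are averaged, compared with
    the prototype feature of each known object, and the similarities are
    turned into softmax weights over the known word embeddings.
    """
    query = mean_vector(novel_features)
    names = list(known_features)
    sims = [cosine(query, known_features[name]) for name in names]

    # Numerically stable softmax over the similarities.
    top = max(sims)
    exps = [math.exp((s - top) / temperature) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(known_embeddings[names[0]])
    return [
        sum(w * known_embeddings[name][d] for w, name in zip(weights, names))
        for d in range(dim)
    ]


# Toy data (made up for illustration): two known objects, one novel object.
known_features = {"horse": [0.9, 0.1, 0.0], "car": [0.0, 0.2, 0.9]}
known_embeddings = {"horse": [1.0, 0.0], "car": [0.0, 1.0]}
zebra_features = [[0.8, 0.2, 0.1], [0.9, 0.0, 0.1]]  # a few examples only

zebra_embedding = estimate_embedding(
    zebra_features, known_features, known_embeddings
)
```

In this toy setup the estimated vector lands close to the "horse" embedding because the novel object's features resemble the horse prototype. Under the same assumptions, such a vector could be appended to a captioning model's vocabulary table without updating any other weights, which is the kind of retraining-free expansion the abstract describes.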
