MM CV Jul 19, 2018

Few-Shot Adaptation for Multimedia Semantic Indexing

arXiv:1807.07203v14.37 citations

Originality: Incremental advance

AI Analysis

This work addresses robust multimedia indexing with limited labeled data. The advance is incremental, since it builds on existing zero-shot and few-shot learning methods.

The paper tackles semantic indexing of image and video data by proposing a few-shot adaptation framework that bridges zero-shot and supervised many-shot learning. It reports 35.98% Mean Average Precision on the TRECVID 2014 dataset under the supervised condition, which the authors describe as the best result on that dataset to their knowledge.

We propose a few-shot adaptation framework, which bridges zero-shot learning and supervised many-shot learning, for semantic indexing of image and video data. Few-shot adaptation provides robust parameter estimation with few training examples by optimizing the parameters of zero-shot learning and supervised many-shot learning simultaneously. In this method, we first build a zero-shot detector and then update it using the few examples. Our experiments show the effectiveness of the proposed framework on three datasets: TRECVID Semantic Indexing 2010, 2014, and ImageNet. On the ImageNet dataset, we show that our method outperforms recent few-shot learning methods. On the TRECVID 2014 dataset, we achieve 15.19% and 35.98% in Mean Average Precision under the zero-shot condition and the supervised condition, respectively. To the best of our knowledge, these are the best results on this dataset.
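The abstract does not state the training objective, so the following is only a minimal sketch of one generic way to "build a zero-shot detector, then update it with a few examples": fit a logistic detector on the few labeled examples while penalizing its distance from the zero-shot weights. It is not the authors' method. The function `adapt_detector`, the names `w_zero` and `lam`, and the toy data are all hypothetical and chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def adapt_detector(w_zero, X, y, lam=1.0, steps=500):
    """Assumed objective (not from the paper):
    mean logistic loss on (X, y) + (lam / 2) * ||w - w_zero||^2.

    A large lam keeps the detector close to the zero-shot weights;
    a small lam lets the few labeled examples dominate.
    """
    n = len(y)
    # Step size 1/L, where L bounds the curvature of the objective.
    lr = 1.0 / (lam + 0.25 * np.linalg.norm(X, 2) ** 2 / n)
    w = np.asarray(w_zero, dtype=float).copy()
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n + lam * (w - w_zero)
        w -= lr * grad
    return w


# Toy data: the zero-shot weights rely on feature 0,
# but the few labeled examples are separated by feature 1.
w_zero = np.array([1.0, 0.0])
X = np.array([[0.0, 1.0], [0.0, -1.0], [0.2, 1.5], [0.1, -1.2]])
y = np.array([1.0, 0.0, 1.0, 0.0])

w_strong = adapt_detector(w_zero, X, y, lam=100.0)  # stays near zero-shot
w_weak = adapt_detector(w_zero, X, y, lam=0.01)     # follows the examples
```

In this sketch the single parameter `lam` interpolates between the two regimes the abstract names: as `lam` grows the result approaches the zero-shot detector, and as it shrinks the result approaches a purely supervised fit.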

