CV · Jul 27, 2017

Learning from Video and Text via Large-Scale Discriminative Clustering

arXiv:1707.09074v1 · 45 citations
Originality: Incremental advance
AI Analysis

This work addresses the limited scalability of discriminative clustering, a practical problem for researchers in weakly supervised learning. The advance is incremental because it builds on existing discriminative clustering methods.

The paper proposes an online optimization algorithm, based on Block-Coordinate Frank-Wolfe, that lets discriminative clustering scale to 66 feature-length movies. Learning at this scale significantly improves weakly supervised action recognition.

Discriminative clustering has been successfully applied to a number of weakly-supervised learning tasks. Such applications include person and action recognition, text-to-video alignment, object co-segmentation and colocalization in videos and images. One drawback of discriminative clustering, however, is its limited scalability. We address this issue and propose an online optimization algorithm based on the Block-Coordinate Frank-Wolfe algorithm. We apply the proposed method to the problem of weakly supervised learning of actions and actors from movies together with corresponding movie scripts. The scaling up of the learning problem to 66 feature length movies enables us to significantly improve weakly supervised action recognition.
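To make the optimization idea concrete, here is a minimal sketch of Block-Coordinate Frank-Wolfe on a toy problem. It is not the paper's implementation. The quadratic cost, the target matrix `T`, and the names `objective` and `bcfw` are assumptions chosen for illustration. Each row of `Y` plays the role of a relaxed label assignment constrained to the probability simplex. The step size 2n/(k + 2n) is the standard default for this algorithm, where n is the number of blocks and k the iteration count.

```python
import random


def objective(Y, T):
    """Toy cost (an assumption): 0.5 * squared Frobenius distance to T."""
    return 0.5 * sum((y - t) ** 2 for ry, rt in zip(Y, T) for y, t in zip(ry, rt))


def bcfw(T, num_iters=5000, seed=0):
    """Block-Coordinate Frank-Wolfe over a product of probability simplices.

    Each row of Y is one block; only one block is touched per iteration,
    which is what makes the method usable in an online fashion.
    """
    rng = random.Random(seed)
    n, k = len(T), len(T[0])
    Y = [[1.0 / k] * k for _ in range(n)]  # feasible start: uniform rows
    for it in range(num_iters):
        i = rng.randrange(n)  # sample one block uniformly at random
        # Gradient of the toy cost with respect to block i only.
        grad = [Y[i][j] - T[i][j] for j in range(k)]
        # Linear minimization oracle on the simplex: the vertex with the
        # smallest gradient coordinate.
        j_star = min(range(k), key=grad.__getitem__)
        gamma = 2.0 * n / (it + 2.0 * n)  # default BCFW step size
        # Convex combination with the chosen vertex keeps the row feasible.
        for j in range(k):
            s_j = 1.0 if j == j_star else 0.0
            Y[i][j] = (1.0 - gamma) * Y[i][j] + gamma * s_j
    return Y


# Hypothetical targets: each row already lies in the simplex.
T = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.5, 0.0, 0.5]]
Y0 = [[1.0 / 3] * 3 for _ in T]
Y = bcfw(T)
print("initial cost:", objective(Y0, T))
print("final cost:  ", objective(Y, T))
```

Each update needs only the gradient of a single block and never leaves the feasible set, so no projection step is required. In a real discriminative clustering cost the blocks are coupled through the objective, which this separable toy cost does not capture.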

Code Implementations: 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
