CV · Feb 12, 2015

Discovering Human Interactions in Videos with Limited Data Labeling

Mehran Khodabandeh, Arash Vahdat, Guang-Tong Zhou, Hossein Hajimirsadeghi, Mehrsan Javan Roshtkhari, Greg Mori, Stephen Se

arXiv:1502.03851v14.517 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for activity understanding in videos where labeled data is scarce. It offers a practical solution for domains such as surveillance and social behavior analysis, although its methodological contribution is an incremental adaptation of Maximum Margin Clustering.

The paper tackles the discovery of human interactions in videos when extensive labeled data is unavailable. It introduces an unsupervised clustering approach that incorporates user feedback, and it reports perfect semantic clusters on three challenging datasets with minimal labeling effort.

We present a novel approach for discovering human interactions in videos. Activity understanding techniques usually require a large number of labeled examples, which are not available in many practical cases. Here, we focus on recovering semantically meaningful clusters of human-human and human-object interactions in an unsupervised fashion. A new iterative solution is introduced based on Maximum Margin Clustering (MMC), which also accepts user feedback to refine clusters. This is achieved by formulating the whole process as a unified constrained latent max-margin clustering problem. Extensive experiments have been carried out over three challenging datasets: Collective Activity, VIRAT, and UT-Interaction. Empirical results demonstrate that the proposed algorithm can efficiently discover perfect semantic clusters of human interactions with only a small amount of labeling effort.
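To make the idea of max-margin clustering refined by user feedback more concrete, here is a minimal sketch. It is not the paper's constrained latent formulation. It is a toy two-cluster alternating heuristic in the spirit of MMC, and everything in it is an assumption made for illustration: the function names (`train_linear_svm`, `relabel`, `cluster_with_feedback`), the modeling of feedback as must-link and cannot-link pairs, the 2-D points standing in for video features, and the hyperparameters. It also assumes the feedback pairs are disjoint, and it omits the class-balance constraint that practical MMC solvers use to avoid collapsing into a single cluster.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n, dim = len(points), len(points[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wk for wk in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wk * xk for wk, xk in zip(w, x)) + b)
            if margin < 1.0:  # this point violates the margin
                for k in range(dim):
                    grad_w[k] -= y * x[k] / n
                grad_b -= y / n
        w = [wk - lr * gk for wk, gk in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def relabel(scores, must_link, cannot_link):
    """Assign each point to a side, then enforce the (hypothetical) feedback pairs."""
    labels = [1 if s >= 0 else -1 for s in scores]
    for i, j in must_link:      # both points go to the side their joint score favours
        labels[i] = labels[j] = 1 if scores[i] + scores[j] >= 0 else -1
    for i, j in cannot_link:    # the two points are forced onto opposite sides
        labels[i] = 1 if scores[i] - scores[j] >= 0 else -1
        labels[j] = -labels[i]
    return labels


def cluster_with_feedback(points, must_link=(), cannot_link=(), rounds=10):
    """Alternate between fitting a max-margin separator and relabelling."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    centred = [[p[k] - mean[k] for k in range(dim)] for p in points]
    # Crude initialisation (an assumption): split on the first centred coordinate.
    labels = relabel([x[0] for x in centred], must_link, cannot_link)
    for _ in range(rounds):
        w, b = train_linear_svm(centred, labels)
        scores = [sum(wk * xk for wk, xk in zip(w, x)) + b for x in centred]
        new_labels = relabel(scores, must_link, cannot_link)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels


# Toy 2-D "video features": two well-separated groups of four points each.
videos = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = cluster_with_feedback(videos, must_link=[(1, 2)], cannot_link=[(0, 4)])
print(labels)
```

In this sketch the feedback pairs act as hard constraints at every relabelling step, so a small number of user answers can override the separator where it disagrees with them. The paper instead folds the feedback into a single constrained optimization problem.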
