CV · Jun 2, 2017

Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints

Alexander Richard, Hilde Kuehne, Juergen Gall

arXiv:1706.00699v2 20.340 citations

Originality: Incremental advance

AI Analysis

This work addresses the cost of annotating actions in video, a problem for both researchers and practitioners: action sets can be collected cheaply, for instance from meta-tags. The contribution is an incremental step in reducing supervision requirements.

The paper tackles the problem of weakly supervised action segmentation in videos by using only unordered action sets as supervision, eliminating the need for ordered sequences or full annotations. It achieves competitive results on three datasets despite significantly reduced supervision.

Action detection and temporal segmentation of actions in videos are topics of increasing interest. While fully supervised systems have gained much attention lately, full annotation of each action within the video is costly and impractical for large amounts of video data. Thus, weakly supervised action detection and temporal segmentation methods are of great importance. While most works in this area assume an ordered sequence of occurring actions to be given, our approach only uses a set of actions. Such action sets provide much less supervision since neither action ordering nor the number of action occurrences are known. In exchange, they can be easily obtained, for instance, from meta-tags, while ordered sequences still require human annotation. We introduce a system that automatically learns to temporally segment and label actions in a video, where the only supervision that is used are action sets. An evaluation on three datasets shows that our method still achieves good results although the amount of supervision is significantly smaller than for other related methods.
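To make the difference between the supervision levels mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the authors' method; it only illustrates how an ordered action sequence and an unordered action set relate to a full frame-level annotation. The function names and the action labels (`pour_milk`, `stir`) are hypothetical and chosen for illustration.

```python
from itertools import groupby


def to_ordered_sequence(frame_labels):
    """Collapse per-frame labels into the ordered sequence of action segments.

    Consecutive duplicates are merged, so order and number of occurrences
    are kept but segment lengths are lost.
    """
    return [label for label, _ in groupby(frame_labels)]


def to_action_set(frame_labels):
    """Reduce per-frame labels to the unordered set of occurring actions.

    Both the ordering and the number of occurrences are lost.
    """
    return set(frame_labels)


# Hypothetical full annotation: one label per video frame.
frames = ["pour_milk"] * 3 + ["stir"] * 2 + ["pour_milk"] * 2 + ["stir"]

ordered_sequence = to_ordered_sequence(frames)
action_set = to_action_set(frames)

print(ordered_sequence)
print(sorted(action_set))
```

In this toy video the ordered sequence still records that each action occurs twice and in which order, while the action set records only that the two actions occur at all. This is the sense in which action sets carry much less supervision.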
