Unsupervised Action Segmentation for Instructional Videos
This work addresses the challenge of segmenting instructional videos, for which atomic action annotations are rare, by offering an unsupervised solution.
The paper tackles the problem of automatically discovering atomic actions in instructional videos without supervision, presenting an unsupervised approach based on a sequential stochastic autoregressive model that learns to represent and discover sequential relationships between actions.
In this paper, we address the problem of automatically discovering atomic actions in an unsupervised manner from instructional videos, which are rarely annotated with atomic actions. We present an unsupervised approach to learning the atomic actions of structured human tasks from a variety of instructional videos, based on a sequential stochastic autoregressive model for temporal segmentation of videos. The model learns to represent and discover the sequential relationships between the different atomic actions of a task, and it provides automatic, unsupervised self-labeling.
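To make the idea of a sequential stochastic autoregressive segmentation concrete, the following is a minimal toy sketch, not the model described in this paper. It assumes a hand-written transition table and per-action mean durations (both hypothetical, as are the function names `sample_segmentation` and `to_segments`), and it samples each action conditioned only on the previous one. A learned model would instead condition on video features and on its own past outputs.

```python
import random


def sample_segmentation(num_frames, transition, mean_duration, seed=0):
    """Sample one frame-level labeling by drawing (action, duration) pairs in order.

    transition[i][j] is an assumed, unnormalized weight for action j following
    action i; mean_duration[j] is an assumed mean segment length in frames.
    """
    rng = random.Random(seed)
    num_actions = len(transition)
    labels = []
    prev = None
    while len(labels) < num_frames:
        # Autoregressive step: the next action depends on the previous action.
        weights = [1.0] * num_actions if prev is None else transition[prev]
        action = rng.choices(range(num_actions), weights=weights, k=1)[0]
        # Stochastic duration of at least one frame (exponential, for illustration).
        duration = max(1, round(rng.expovariate(1.0 / mean_duration[action])))
        labels.extend([action] * duration)
        prev = action
    # Trim the last segment so the labeling covers exactly num_frames frames.
    return labels[:num_frames]


def to_segments(labels):
    """Collapse frame labels into (action, start, end) runs, with end exclusive."""
    segments = []
    start = 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((labels[start], start, t))
            start = t
    return segments


# Hypothetical three-action task; a zero diagonal forbids immediate repeats.
TRANSITION = [
    [0.0, 0.9, 0.1],
    [0.1, 0.0, 0.9],
    [0.9, 0.1, 0.0],
]
MEAN_DURATION = [20, 35, 15]

frame_labels = sample_segmentation(200, TRANSITION, MEAN_DURATION, seed=0)
segments = to_segments(frame_labels)
```

Each entry of `segments` is one candidate atomic-action interval. In an unsupervised setting, labelings of this kind can serve as pseudo-labels, which is the sense in which a segmentation model can be said to self-label.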