CV, Jun 30, 2022

Timestamp-Supervised Action Segmentation with Graph Convolutional Networks

arXiv:2206.15031v4, 30 citations, h-index: 21
Originality: Incremental advance
AI Analysis

This paper addresses the cost of annotating videos for activity segmentation. The advance is incremental, since it builds on existing timestamp supervision methods.

The paper tackles temporal activity segmentation with timestamp supervision by introducing a graph convolutional network to generate dense framewise labels from sparse timestamps, achieving performance on par with or better than state-of-the-art methods on four public datasets.

We introduce a novel approach for temporal activity segmentation with timestamp supervision. Our main contribution is a graph convolutional network, which is learned in an end-to-end manner to exploit both frame features and connections between neighboring frames to generate dense framewise labels from sparse timestamp labels. The generated dense framewise labels can then be used to train the segmentation model. In addition, we propose a framework for alternating learning of both the segmentation model and the graph convolutional model, which first initializes and then iteratively refines the learned models. Detailed experiments on four public datasets, including 50 Salads, GTEA, Breakfast, and Desktop Assembly, show that our method is superior to the multi-layer perceptron baseline, while performing on par with or better than the state of the art in temporal activity segmentation with timestamp supervision.
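To make the idea of propagating information between neighboring frames concrete, the following is a minimal sketch of a two-layer graph convolution over a temporal chain graph. It is not the authors' architecture: the function names, the `window` parameter, the layer sizes, and the symmetric normalization are all assumptions chosen for illustration, and the weights are random rather than learned. In a real setup the weights would be trained, for example with a loss evaluated only at the annotated timestamp frames, which is omitted here.

```python
import numpy as np

def chain_adjacency(num_frames, window=1):
    """Adjacency of a temporal chain graph (assumed structure):
    frame i is linked to frames within `window` steps, plus a self-loop."""
    idx = np.arange(num_frames)
    return (np.abs(idx[:, None] - idx[None, :]) <= window).astype(float)

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(features, adj_norm, weights):
    """One graph convolution: average over neighbors, linear map, ReLU."""
    return np.maximum(adj_norm @ features @ weights, 0.0)

def dense_labels(features, adj_norm, w_hidden, w_out):
    """Two-layer graph convolution followed by a per-frame argmax."""
    hidden = gcn_layer(features, adj_norm, w_hidden)
    logits = adj_norm @ hidden @ w_out
    return logits.argmax(axis=1)

# Toy sizes and random data, purely illustrative.
rng = np.random.default_rng(0)
num_frames, feat_dim, hidden_dim, num_classes = 12, 8, 16, 3
features = rng.normal(size=(num_frames, feat_dim))
adj_norm = normalize(chain_adjacency(num_frames, window=1))
w_hidden = rng.normal(size=(feat_dim, hidden_dim))  # untrained weights
w_out = rng.normal(size=(hidden_dim, num_classes))  # untrained weights
labels = dense_labels(features, adj_norm, w_hidden, w_out)
```

The output `labels` holds one class index per frame, which is the form of dense framewise labels that a segmentation model could then be trained on.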

