CV | AI | Apr 13, 2023

Leveraging triplet loss for unsupervised action segmentation

arXiv:2304.06403v2 | 16 citations | h-index: 18 | Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the problem of segmenting actions in videos without labeled data. The advance is incremental because it builds on existing unsupervised approaches.

The paper tackles unsupervised action segmentation from a single video, without training data, using a deep metric learning approach that combines a triplet loss with a novel triplet selection strategy. It reports competitive performance on two benchmark datasets.

In this paper, we propose a novel fully unsupervised framework that learns action representations suitable for the action segmentation task from the single input video itself, without requiring any training data. Our method is a deep metric learning approach rooted in a shallow network with a triplet loss operating on similarity distributions and a novel triplet selection strategy that effectively models temporal and semantic priors to discover actions in the new representational space. Under these circumstances, we successfully recover temporal boundaries in the learned action representations with higher quality compared with existing unsupervised approaches. The proposed method is evaluated on two widely used benchmark datasets for the action segmentation task and it achieves competitive performance by applying a generic clustering algorithm on the learned representations.
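To make the idea of a triplet loss with temporally informed triplet selection concrete, here is a minimal sketch in Python with NumPy. It is not the paper's method: the abstract describes a loss operating on similarity distributions and a selection strategy combining temporal and semantic priors, whereas this sketch uses the standard margin-based triplet loss on Euclidean distances and a purely temporal heuristic (positives are nearby frames, negatives are distant frames). The function names and the `near`, `far`, and `margin` parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (an assumption, not the paper's loss).

    Each argument is an array of shape (n_triplets, dim). The loss is zero
    once every negative is at least `margin` farther from its anchor than
    the corresponding positive.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def sample_temporal_triplets(n_frames, near=2, far=10, seed=0):
    """Hypothetical temporal prior: frames within `near` steps of the anchor
    are treated as positives, frames at least `far` steps away as negatives.

    Returns an integer array of shape (n_triplets, 3) holding
    (anchor, positive, negative) frame indices.
    """
    rng = np.random.default_rng(seed)
    triplets = []
    for a in range(n_frames):
        lo, hi = max(0, a - near), min(n_frames, a + near + 1)
        pos_candidates = [t for t in range(lo, hi) if t != a]
        neg_candidates = [t for t in range(n_frames) if abs(t - a) >= far]
        # Skip anchors that have no valid positive or negative frame.
        if pos_candidates and neg_candidates:
            p = int(rng.choice(pos_candidates))
            n = int(rng.choice(neg_candidates))
            triplets.append((a, p, n))
    return np.array(triplets, dtype=int)


if __name__ == "__main__":
    # Toy per-frame features standing in for the video's frame embeddings.
    rng = np.random.default_rng(0)
    features = rng.normal(size=(30, 8))
    idx = sample_temporal_triplets(n_frames=30)
    loss = triplet_margin_loss(
        features[idx[:, 0]], features[idx[:, 1]], features[idx[:, 2]]
    )
    print(f"{len(idx)} triplets, loss = {loss:.3f}")
```

In a full pipeline, a loss of this kind would be minimized over the parameters of a small embedding network, and the resulting frame embeddings would then be grouped with a generic clustering algorithm, as the abstract describes.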

Code Implementations: 1 repo