CV · Mar 28, 2018

Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment

arXiv:1803.10699v1 · 201 citations
Originality: Incremental advance
AI Analysis

This paper addresses the high computational cost of action segmentation in video analysis and offers a more efficient, scalable solution. It appears incremental, however, because it builds on existing weakly-supervised approaches.

The paper tackles weakly-supervised human action segmentation in long videos. It proposes a framework that combines a Temporal Convolutional Feature Pyramid Network (TCFPN) with an Iterative Soft Boundary Assignment (ISBA) training strategy, and it reports competitive or superior performance to state-of-the-art methods on the Breakfast and Hollywood Extended benchmark datasets.

In this work, we address the task of weakly-supervised human action segmentation in long, untrimmed videos. Recent methods have relied on expensive learning models, such as Recurrent Neural Networks (RNN) and Hidden Markov Models (HMM). However, these methods suffer from high computational cost and thus cannot be deployed at large scale. To overcome these limitations, we make efficiency and scalability the keys to our design. We propose a novel action modeling framework, which consists of a new temporal convolutional network, named Temporal Convolutional Feature Pyramid Network (TCFPN), for predicting frame-wise action labels, and a novel training strategy for weakly-supervised sequence modeling, named Iterative Soft Boundary Assignment (ISBA), to align action sequences and update the network in an iterative fashion. The proposed framework is evaluated on two benchmark datasets, Breakfast and Hollywood Extended, with four different evaluation metrics. Extensive experimental results show that our methods achieve competitive or superior performance to state-of-the-art methods.
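The abstract does not spell out how ISBA assigns boundaries, so the following is only a minimal sketch of the general idea behind weak supervision from an ordered action transcript: frame-wise training targets can be built by spreading the transcript over the video and softening the labels near segment boundaries. The function name `uniform_soft_targets`, the even split, and the moving-average smoothing with a `radius` parameter are all assumptions made for illustration, not the paper's procedure.

```python
def uniform_soft_targets(transcript, num_frames, num_classes, radius=1):
    """Hypothetical helper: spread an ordered action transcript evenly over
    the frames, then soften segment boundaries with a moving average.

    This is an illustrative assumption, not the ISBA update from the paper.
    """
    if not transcript or num_frames < len(transcript):
        raise ValueError("need at least one frame per transcript action")

    # Hard assignment: each frame takes the label of the equal-length
    # segment it falls into.
    n = len(transcript)
    hard = [transcript[t * n // num_frames] for t in range(num_frames)]

    # Soft assignment: average the one-hot labels in a window around each
    # frame, so frames near a boundary share probability mass between the
    # two neighbouring actions. Each row sums to 1.
    soft = []
    for t in range(num_frames):
        lo, hi = max(0, t - radius), min(num_frames, t + radius + 1)
        row = [0.0] * num_classes
        for u in range(lo, hi):
            row[hard[u]] += 1.0 / (hi - lo)
        soft.append(row)
    return hard, soft


# Usage: a 4-frame video whose transcript is action 0 followed by action 2.
hard, soft = uniform_soft_targets([0, 2], num_frames=4, num_classes=3)
```

In an iterative scheme of the kind the abstract describes, targets like these would be used to train the frame-wise network, and the network's predictions would then inform a revised alignment for the next round. How that revision is done is not stated in the text above.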

Code Implementations: 1 repo
