CV · Mar 10, 2024

Coherent Temporal Synthesis for Incremental Action Segmentation

arXiv:2403.06102v1 · 7 citations · h-index: 3 · CVPR
Originality: Incremental advance
AI Analysis

This work addresses incremental learning for video action segmentation, a domain-specific problem. Its method is novel and improves performance, but the advance is incremental in nature.

The paper tackles catastrophic forgetting in incremental action segmentation for videos. It proposes a Temporally Coherent Action (TCA) model that replays data from a generative model instead of storing individual frames, achieving accuracy gains of up to 22% over the baselines in a 10-task setup on the Breakfast dataset.

Data replay is a successful incremental learning technique for images. It prevents catastrophic forgetting by keeping a reservoir of previous data, original or synthesized, to ensure the model retains past knowledge while adapting to novel concepts. However, its application in the video domain is rudimentary, as it simply stores frame exemplars for action recognition. This paper presents the first exploration of video data replay techniques for incremental action segmentation, focusing on action temporal modeling. We propose a Temporally Coherent Action (TCA) model, which represents actions using a generative model instead of storing individual frames. The integration of a conditioning variable that captures temporal coherence allows our model to understand the evolution of action features over time. Therefore, action segments generated by TCA for replay are diverse and temporally coherent. In a 10-task incremental setup on the Breakfast dataset, our approach achieves significant increases in accuracy of up to 22% compared to the baselines.
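To make the idea of generative replay with a temporal conditioning variable concrete, here is a minimal sketch. It is not the paper's TCA model: the class `ToySegmentGenerator`, the scalar frame feature, the per-bin Gaussian, the action label `pour_milk`, and the sample data are all hypothetical choices made for illustration. The sketch only shows the general pattern of summarizing an old action by a small generative model conditioned on relative time, then sampling synthetic segments from it instead of storing frames.

```python
import random
from statistics import fmean, pstdev


class ToySegmentGenerator:
    """Toy stand-in for a generative replay model (NOT the paper's TCA model).

    For each action label it keeps the mean and standard deviation of a
    scalar frame feature in each of `n_bins` relative-time bins, so sampled
    segments follow how the feature evolved over the course of the action.
    Assumes every training segment has at least `n_bins` frames.
    """

    def __init__(self, n_bins=4, seed=0):
        self.n_bins = n_bins
        self.rng = random.Random(seed)
        self.stats = {}  # label -> list of (mean, std), one pair per time bin

    def _bin(self, frame_idx, length):
        # Map the frame's relative position in the segment to a bin index.
        return min(self.n_bins - 1, frame_idx * self.n_bins // length)

    def fit(self, label, segments):
        """segments: list of lists of floats, one list per observed segment."""
        buckets = [[] for _ in range(self.n_bins)]
        for seg in segments:
            for i, value in enumerate(seg):
                buckets[self._bin(i, len(seg))].append(value)
        self.stats[label] = [(fmean(b), pstdev(b)) for b in buckets]

    def sample(self, label, length):
        """Generate one synthetic segment of `length` frames for replay."""
        return [
            self.rng.gauss(*self.stats[label][self._bin(i, length)])
            for i in range(length)
        ]


gen = ToySegmentGenerator(n_bins=4, seed=0)
# Hypothetical old action whose feature rises over the course of the segment.
gen.fit("pour_milk", [
    [0.0, 0.1, 0.5, 0.6, 0.9, 1.0, 1.4, 1.5],
    [0.1, 0.2, 0.4, 0.5, 1.0, 1.1, 1.5, 1.6],
])
# Synthetic segment of a different length, to be mixed into new-task training.
replayed = gen.sample("pour_milk", length=12)
```

Only the per-bin statistics are kept after `fit`, so memory does not grow with the number of stored frames, and each call to `sample` can return a segment of any requested length whose features still drift the way the original action did. A real system would replace the per-bin Gaussian with a learned generative model over high-dimensional frame features.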

