CV · Dec 22, 2024

Video Domain Incremental Learning for Human Action Recognition in Home Environments

arXiv:2412.16946v1 · 1 citation · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses the problem of adapting action recognition models to diverse home settings and users. The advance is incremental, since it builds on established continual learning techniques.

The paper tackles the challenge of catastrophic forgetting in video understanding models when adapting to new home environments, introducing a Video Domain Incremental Learning benchmark and showing that a replay-based baseline outperforms most existing continual learning methods across user, scene, and hybrid domain splits.

Recognizing daily human actions in homes is challenging because unconstrained home environments are diverse and change dynamically, which spurs the need to adapt continually to various users and scenes. Fine-tuning current video understanding models on newly encountered domains often leads to catastrophic forgetting, where the models lose their ability to perform well on previously learned scenarios. To address this issue, we formalize the problem of Video Domain Incremental Learning (VDIL), in which models learn continually from different domains while the set of action classes stays fixed. Existing continual learning research primarily focuses on class-incremental learning, while domain-incremental learning has been largely overlooked in video understanding. In this work, we introduce a novel benchmark of domain-incremental human action recognition for unconstrained home environments. We design three domain split types (user, scene, hybrid) to systematically assess the challenges posed by domain shifts in real-world home settings. Furthermore, we propose a baseline learning strategy based on replay and reservoir sampling that requires no domain labels, to handle scenarios with limited memory and task agnosticism. Extensive experimental results demonstrate that our simple sampling and replay strategy outperforms most existing continual learning methods across the three proposed benchmarks.
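The abstract names replay with reservoir sampling as the baseline but gives no implementation details. The sketch below shows one common form of the idea (reservoir sampling, Algorithm R, feeding a fixed-size replay memory) and is not the authors' code. The names `ReservoirReplayBuffer`, `training_stream`, `capacity`, and `replay_size` are hypothetical, and the samples are placeholders standing in for video clips. Note that the buffer never looks at a domain label, which matches the task-agnostic setting described above.

```python
import random


class ReservoirReplayBuffer:
    """Fixed-capacity memory filled by reservoir sampling (Algorithm R).

    Illustrative sketch only: every sample seen so far has the same
    probability of being in memory, and no domain label is needed.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.samples = []
        self.num_seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        """Offer one streamed sample to the memory."""
        self.num_seen += 1
        if len(self.samples) < self.capacity:
            # Memory not yet full: always keep the sample.
            self.samples.append(sample)
            return
        # Keep the new sample with probability capacity / num_seen,
        # overwriting a uniformly chosen slot.
        j = self.rng.randrange(self.num_seen)
        if j < self.capacity:
            self.samples[j] = sample

    def sample(self, batch_size):
        """Draw up to batch_size stored samples without replacement."""
        k = min(batch_size, len(self.samples))
        return self.rng.sample(self.samples, k)


def training_stream(domains, buffer, replay_size):
    """Yield (new_sample, replayed_samples) pairs over a sequence of domains.

    A real training loop would compute the loss on the new sample together
    with the replayed ones; here the pairs are only yielded.
    """
    for domain in domains:
        for item in domain:
            replay = buffer.sample(replay_size)
            yield item, replay
            buffer.add(item)
```

As a usage example, iterating `training_stream` over two toy domains with `ReservoirReplayBuffer(capacity=5)` keeps the memory at five items however long the stream runs, which is the limited-memory property the abstract refers to.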

