CV | AI | Nov 30, 2022

From Actions to Events: A Transfer Learning Approach Using Improved Deep Belief Networks

Microsoft
arXiv:2211.17045v1 | 3 citations | h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses the computational cost of video analysis for researchers and practitioners, but it is an incremental advance because it builds on existing transfer learning techniques and energy-based models.

The paper tackles the expensive, time-consuming training required for video analysis by proposing a transfer learning approach that maps knowledge from action recognition to event recognition using a Spectral Deep Belief Network. The model processes all frames simultaneously and, on the HMDB-51 and UCF-101 datasets, shows a reduced computational burden compared to traditional energy-based models.

In the last decade, exponential data growth has expanded the capacity of machine learning-based algorithms and enabled their use in daily-life activities. This improvement is also partially explained by the advent of deep learning techniques, i.e., stacks of simple architectures that combine into more complex models. Although both factors produce outstanding results, they also pose drawbacks for the learning process, as training complex models over large datasets is expensive and time-consuming. This problem is even more evident in video analysis. Some works have considered transfer learning or domain adaptation, i.e., approaches that map knowledge from one domain to another, to ease the training burden, yet most of them operate over individual frames or small blocks of frames. This paper proposes a novel approach to map knowledge from action recognition to event recognition using an energy-based model, denoted the Spectral Deep Belief Network. The model can process all frames simultaneously, carrying spatial and temporal information through the learning process. Experimental results on two public video datasets, HMDB-51 and UCF-101, demonstrate the effectiveness of the proposed model and its reduced computational burden compared to traditional energy-based models, such as Restricted Boltzmann Machines and Deep Belief Networks.

