CVAIJan 7, 2021

Learning Grammar of Complex Activities via Deep Neural Networks

arXiv:2101.02774v1
Originality Synthesis-oriented
AI Analysis

This work aims to improve video learning models for applications like autonomous driving, which could benefit developers and researchers in computer vision.

This paper explores the theoretical underpinnings of deep neural networks for learning from video data, particularly under label constraints. It builds on prior work in video learning and proposes mechanisms to enhance model performance observations.

Motivated by the growing amount of publicly available video data on online streaming services and an increased interest in applications that analyze continuous video streams such as autonomous driving, this technical report provides a theoretical insight into deep neural networks for video learning, under label constraints. I build upon previous work in video learning for computer vision, make observations on model performance and propose further mechanisms to help improve our observations.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes