CV AI LG Jul 9, 2023

Semi Supervised Meta Learning for Spatiotemporal Learning

arXiv:2308.01916v11.5h-index: 6

Originality: Synthesis-oriented

AI Analysis

This work explores incremental improvements to representation learning methods for video analysis applications.

The researchers investigated how meta-learning affects state-of-the-art masked autoencoder architectures for spatiotemporal learning, testing three approaches on video reconstruction and action classification tasks.

We approach the goal of applying meta-learning to self-supervised masked autoencoders (MAEs) for spatiotemporal learning in three steps. Broadly, we seek to understand the impact of applying meta-learning to existing state-of-the-art representation learning architectures. We therefore test spatiotemporal learning with three configurations: a meta-learning architecture only, a representation learning architecture only, and a representation learning architecture combined with a meta-learning architecture. We use the Memory Augmented Neural Network (MANN) architecture to bring meta-learning into our framework. Specifically, we first apply a pre-trained MAE and fine-tune it on our small-scale spatiotemporal dataset for video reconstruction. Next, we train an MAE encoder and attach a classification head for action classification. Finally, we apply a pre-trained MAE and fine-tune it with a MANN backbone for action classification.
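To make the masked-autoencoder input stage concrete, here is a minimal NumPy sketch of how a video clip might be cut into spatiotemporal patches and randomly masked before being passed to an encoder. It is an illustration under stated assumptions, not the authors' implementation: the function names `patchify_video` and `random_mask_tokens`, the patch sizes, the clip dimensions, and the 0.75 mask ratio are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def patchify_video(video, t_patch, s_patch):
    """Split a video of shape (T, H, W, C) into flattened spatiotemporal patches.

    Returns an array of shape (num_tokens, t_patch * s_patch * s_patch * C).
    Patch sizes are illustrative assumptions, not values from the paper.
    """
    T, H, W, C = video.shape
    if T % t_patch or H % s_patch or W % s_patch:
        raise ValueError("video dimensions must be divisible by the patch sizes")
    x = video.reshape(
        T // t_patch, t_patch, H // s_patch, s_patch, W // s_patch, s_patch, C
    )
    # Bring the three patch-grid axes to the front, then flatten each patch.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, t_patch * s_patch * s_patch * C)


def random_mask_tokens(tokens, mask_ratio, rng):
    """Hide a random subset of tokens, in the style of MAE pre-training.

    Returns (visible_tokens, mask, keep_idx); mask[i] is True when token i
    is hidden from the encoder and must be reconstructed by the decoder.
    """
    num_tokens = tokens.shape[0]
    num_keep = int(round(num_tokens * (1.0 - mask_ratio)))
    keep_idx = np.sort(rng.permutation(num_tokens)[:num_keep])
    mask = np.ones(num_tokens, dtype=bool)
    mask[keep_idx] = False
    return tokens[keep_idx], mask, keep_idx


rng = np.random.default_rng(0)
video = rng.standard_normal((8, 32, 32, 3))   # hypothetical clip: 8 frames of 32x32 RGB
tokens = patchify_video(video, t_patch=2, s_patch=16)
visible, mask, keep_idx = random_mask_tokens(tokens, mask_ratio=0.75, rng=rng)
print(tokens.shape, visible.shape, int(mask.sum()))
```

In a full pipeline, only `visible` would be fed to the encoder, and a reconstruction loss would be computed on the positions where `mask` is True. For the classification settings described above, the encoder output would instead be passed to a classification head or a MANN backbone.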
