CV, QM | Dec 1, 2022

Navigating an Ocean of Video Data: Deep Learning for Humpback Whale Classification in YouTube Videos

arXiv:2212.00822v11.4

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of efficiently filtering large-scale social media video data for biodiversity assessments, specifically humpback whale monitoring. Its contribution is incremental: it applies existing deep learning methods to a new dataset.

The authors tackled the problem of automatically classifying YouTube videos as relevant or irrelevant for humpback whale encounters using a CNN-RNN architecture, achieving an average accuracy of 85.7% and F1 scores of 84.7% for irrelevant and 86.6% for relevant classes.

Image analysis technologies empowered by artificial intelligence (AI) have shown images and videos to be a valuable source of data for learning about humpback whale (Megaptera novaeangliae) population sizes and dynamics. With the advent of social media, platforms such as YouTube offer an abundance of video data, spanning many spatiotemporal contexts, that documents humpback whale encounters from users worldwide. In our work, we use deep learning to automate the classification of YouTube videos as relevant or irrelevant, based on whether they document a true humpback whale encounter. We use a CNN-RNN architecture pretrained on the ImageNet dataset for this classification. Using five-fold cross-validation on the dataset, we achieve an average accuracy of 85.7% and F1 scores of 84.7% (irrelevant) and 86.6% (relevant). We show that deep learning can serve as a time-efficient step that makes social media a viable source of image and video data for biodiversity assessments.
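The evaluation protocol in the abstract (five-fold cross-validation with average accuracy and per-class F1) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the label strings "relevant" and "irrelevant", and the `train_and_predict` callback standing in for the CNN-RNN classifier are all assumptions made for illustration.

```python
import random
from statistics import mean


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle item indices and split them into k nearly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, positive):
    """F1 score, treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cross_validate(labels, train_and_predict, k=5, seed=0):
    """Average accuracy and per-class F1 over k folds.

    `train_and_predict(train_idx, test_idx)` is a hypothetical callback:
    it trains a classifier on the videos in train_idx and returns one
    predicted label per index in test_idx.
    """
    folds = k_fold_indices(len(labels), k, seed)
    accs, f1_relevant, f1_irrelevant = [], [], []
    for i, test_idx in enumerate(folds):
        # Every fold except the held-out one is used for training.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        y_pred = train_and_predict(train_idx, test_idx)
        y_true = [labels[j] for j in test_idx]
        accs.append(accuracy(y_true, y_pred))
        f1_relevant.append(f1_for_class(y_true, y_pred, "relevant"))
        f1_irrelevant.append(f1_for_class(y_true, y_pred, "irrelevant"))
    return {
        "accuracy": mean(accs),
        "f1_relevant": mean(f1_relevant),
        "f1_irrelevant": mean(f1_irrelevant),
    }
```

Reporting F1 separately for each class, as the abstract does, shows whether the classifier favors one label over the other, which a single accuracy figure can hide.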
