CV · Jul 10, 2020

AViD Dataset: Anonymized Videos from Diverse Countries

arXiv:2007.05515v3 · 45 citations
Originality: Synthesis-oriented
AI Analysis

AViD provides a more globally representative dataset for action recognition, which can help researchers and practitioners train models that generalize across regions. The contribution is incremental, since it builds on existing dataset creation efforts.

The authors introduced the AViD dataset, a public collection of anonymized action videos from diverse countries, to address the statistical bias in existing datasets that limits model transferability across regions, showing it performs comparably or better for pretraining.

We introduce a new public video dataset for action recognition: Anonymized Videos from Diverse countries (AViD). Unlike existing public video datasets, AViD is a collection of action videos from many different countries. The motivation is to create a public dataset that would benefit training and pretraining of action recognition models for everybody, rather than making it useful for limited countries. Further, all the face identities in the AViD videos are properly anonymized to protect their privacy. It is also a static dataset where each video is licensed with a Creative Commons license. We confirm that most of the existing video datasets are statistically biased to only capture action videos from a limited number of countries. We experimentally illustrate that models trained with such biased datasets do not transfer perfectly to action videos from other countries, and show that AViD addresses this problem. We also confirm that the new AViD dataset could serve as a good dataset for pretraining the models, performing comparably or better than prior datasets.

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
