CV LG | Apr 16, 2024

TV100: A TV Series Dataset that Pre-Trained CLIP Has Not Seen

Da-Wei Zhou, Zhi-Hong Qi, Han-Jia Ye, De-Chuan Zhan

arXiv:2404.12407v1 | 7.64 citations | h-index: 40 | Frontiers of Computer Science

Originality: Synthesis-oriented

AI Analysis

The dataset gives researchers a new benchmark for studying pre-trained models and their limitations, but the contribution is incremental: it focuses on dataset creation rather than methodological advancement.

The authors address the question of whether pre-trained models possess comprehensive knowledge by releasing a novel dataset of images from TV series released after 2021. The dataset can be used to evaluate incremental learning, novel class discovery, and long-tailed learning.

The era of pre-trained models has ushered in a wealth of new insights for the machine learning community. Among the myriad of questions that arise, one of paramount importance is: 'Do pre-trained models possess comprehensive knowledge?' This paper seeks to address this crucial inquiry. In line with our objective, we have made publicly available a novel dataset comprised of images from TV series released post-2021. This dataset holds significant potential for use in various research areas, including the evaluation of incremental learning, novel class discovery, and long-tailed learning, among others. Project page: https://tv-100.github.io/
