CV · LG · Apr 16, 2024

TV100: A TV Series Dataset that Pre-Trained CLIP Has Not Seen

arXiv:2404.12407v1 · 4 citations · h-index: 40 · Frontiers of Computer Science
Originality: Synthesis-oriented
AI Analysis

The dataset gives researchers a new benchmark for studying pre-trained models and their limitations, but the contribution is incremental: it focuses on dataset creation rather than methodological advancement.

The authors tackled the question of whether pre-trained models have comprehensive knowledge by creating and releasing a novel dataset of images from TV series released after 2021, which can be used for evaluating incremental learning, novel class discovery, and long-tailed learning.

The era of pre-trained models has ushered in a wealth of new insights for the machine learning community. Among the myriad of questions that arise, one of paramount importance is: 'Do pre-trained models possess comprehensive knowledge?' This paper seeks to address this crucial inquiry. In line with our objective, we have made publicly available a novel dataset comprised of images from TV series released post-2021. This dataset holds significant potential for use in various research areas, including the evaluation of incremental learning, novel class discovery, and long-tailed learning, among others. Project page: https://tv-100.github.io/

