IR | CL | Mar 29, 2021

TREC 2020 Podcasts Track Overview

arXiv:2103.15953v1 | 41 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the need for standardized benchmarks in podcast analysis for the information retrieval and NLP communities, but it is incremental in that it builds on existing TREC frameworks.

The paper introduces the Podcasts Track at TREC 2020, which was designed to foster research in information retrieval and NLP for podcasts. The track featured segment retrieval and summarization tasks over a dataset of more than 100,000 episodes, and it attracted hundreds of new TREC registrations and final submissions from 15 teams.

The Podcast Track is new at the Text Retrieval Conference (TREC) in 2020. The track was designed to encourage research into podcasts in the information retrieval and NLP research communities. It consisted of two shared tasks, segment retrieval and summarization, both based on a dataset of over 100,000 podcast episodes (metadata, audio, and automatic transcripts) that was released concurrently with the track. The track generated considerable interest and attracted hundreds of new registrations to TREC; fifteen teams, mostly disjoint between search and summarization, made final submissions for assessment. Deep learning was the dominant experimental approach for both search and summarization. This paper gives an overview of the tasks and the results of the participants' experiments. The track will return to TREC 2021 with the same two tasks, incorporating slight modifications in response to participant feedback.

