White Paper: Challenges and Considerations for the Creation of a Large Labelled Repository of Online Videos with Questionable Content
The paper tackles the problem of building a dataset for AI research on questionable content. Its contribution is incremental: it discusses considerations rather than introducing new methods or data.
This white paper addresses the challenge of creating a large labelled repository of online videos with questionable content, focusing on defining appropriate labels, designing the collection and annotation processes, and mitigating the risk of trauma to annotators. It does not present specific results or numbers.
This white paper summarizes discussions of the critical considerations in developing an extensive repository of online videos annotated with labels indicating questionable content. The main discussion points are: 1) which types of labels will result in a repository that is valuable to the larger AI community; 2) how to design the collection and annotation process, as well as the distribution of the corpus, to maximize its potential impact; and 3) what actions we can take to reduce the risk of trauma to annotators.