MovieQA: Understanding Stories in Movies through Question-Answering
MovieQA provides a benchmark for researchers in AI and natural language processing to advance story understanding in movies, though the methodological contribution is incremental, since it extends existing QA techniques and applies them to a new dataset.
The authors address the problem of evaluating automatic story comprehension in movies by introducing the MovieQA dataset, which contains 14,944 questions about 408 movies and draws on diverse semantics and multiple information sources. They show that question-answering in this open-ended domain is challenging.
We introduce the MovieQA dataset, which aims to evaluate automatic story comprehension from both video and text. The dataset consists of 14,944 questions about 408 movies with high semantic diversity. The questions range from simpler "Who" did "What" to "Whom", to "Why" and "How" certain events occurred. Each question comes with a set of five possible answers: one correct answer and four deceiving answers provided by human annotators. Our dataset is unique in that it contains multiple sources of information: video clips, plots, subtitles, scripts, and DVS. We analyze our data through various statistics and methods. We further extend existing QA techniques to show that question-answering with such open-ended semantics is hard. We make this dataset public along with an evaluation benchmark to encourage inspiring work in this challenging domain.
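The five-way multiple-choice format described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the class `MultipleChoiceQuestion`, its fields, the `accuracy` helper, and the placeholder questions are hypothetical and do not reflect the released dataset's actual schema or the benchmark's official scoring code. The sketch also assumes plain accuracy as the metric, which the text above does not state.

```python
from dataclasses import dataclass
from typing import List, Sequence

NUM_CHOICES = 5  # one correct answer plus four deceiving answers


@dataclass
class MultipleChoiceQuestion:
    """One five-way multiple-choice item (hypothetical field names)."""
    question: str
    answers: List[str]   # the five candidate answers
    correct_index: int   # position of the single correct answer

    def __post_init__(self) -> None:
        if len(self.answers) != NUM_CHOICES:
            raise ValueError("expected exactly five candidate answers")
        if not 0 <= self.correct_index < NUM_CHOICES:
            raise ValueError("correct_index must be between 0 and 4")


def accuracy(questions: Sequence[MultipleChoiceQuestion],
             predictions: Sequence[int]) -> float:
    """Fraction of questions whose predicted index matches the correct one."""
    if len(questions) != len(predictions):
        raise ValueError("need exactly one prediction per question")
    if not questions:
        return 0.0
    hits = sum(q.correct_index == p for q, p in zip(questions, predictions))
    return hits / len(questions)


if __name__ == "__main__":
    # Placeholder items, invented for illustration; not taken from the dataset.
    items = [
        MultipleChoiceQuestion(
            "Why does character A leave town?",
            ["answer 0", "answer 1", "answer 2", "answer 3", "answer 4"],
            correct_index=2,
        ),
        MultipleChoiceQuestion(
            "Who helps character B?",
            ["answer 0", "answer 1", "answer 2", "answer 3", "answer 4"],
            correct_index=1,
        ),
    ]
    print(accuracy(items, [2, 0]))
```

With five candidates per question, uniform random guessing has an expected accuracy of 1/5, which gives a natural floor against which any QA method on this format can be compared.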