CV · Apr 18, 2019

Progressive Attention Memory Network for Movie Story Question Answering

arXiv:1904.08607v1 · 80 citations
Originality: Incremental advance
AI Analysis

This work addresses movie story QA. It is an incremental advance: it builds on existing methods and adds specific improvements.

The paper tackles movie story question answering by proposing the progressive attention memory network (PAMN), which addresses challenges in pinpointing relevant temporal parts and fusing video and subtitle modalities, achieving state-of-the-art results on MovieQA and TVQA datasets.

This paper proposes the progressive attention memory network (PAMN) for movie story question answering (QA). Movie story QA is challenging compared to VQA in two respects: (1) pinpointing the temporal parts relevant to answering the question is difficult, as movies are typically longer than an hour, and (2) a movie has both video and subtitles, and different questions require different modalities to infer the answer. To overcome these challenges, PAMN has three main features: (1) a progressive attention mechanism that uses cues from both the question and the answer to progressively prune out irrelevant temporal parts in memory, (2) dynamic modality fusion that adaptively determines the contribution of each modality to answering the current question, and (3) a belief correction answering scheme that successively corrects the prediction score on each candidate answer. Experiments on the publicly available benchmark datasets MovieQA and TVQA demonstrate that each feature contributes to the movie story QA architecture, PAMN, and improves performance to achieve state-of-the-art results. A qualitative analysis that visualizes the inference mechanism of PAMN is also provided.
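
The three features named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no equations, so every function name, the soft pruning by attention weights, the scalar sigmoid gate, and the additive score correction below are assumptions chosen only to make the ideas concrete.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(memory, query):
    """Dot-product attention weights of `query` over the memory slots."""
    return softmax([dot(slot, query) for slot in memory])

def progressive_attention(memory, question, answer):
    """Assumed form: two passes, each scaling slots by their attention weight."""
    for cue in (question, answer):
        weights = attend(memory, cue)
        memory = [[w * x for x in slot] for w, slot in zip(weights, memory)]
    # Sum the soft-pruned slots into a single context vector.
    return [sum(col) for col in zip(*memory)]

def fuse(video_ctx, subtitle_ctx, question, gate_weights):
    """Assumed form: a question-dependent scalar gate between modalities."""
    g = 1.0 / (1.0 + math.exp(-dot(question, gate_weights)))
    return [g * v + (1.0 - g) * s for v, s in zip(video_ctx, subtitle_ctx)]

def answer_scores(video_mem, subtitle_mem, question, answers, gate_weights):
    # Initial belief: match between the question and each candidate answer.
    beliefs = [dot(question, a) for a in answers]
    # Correction step (assumed additive): evidence from the fused context.
    for i, a in enumerate(answers):
        v = progressive_attention(video_mem, question, a)
        s = progressive_attention(subtitle_mem, question, a)
        beliefs[i] += dot(fuse(v, s, question, gate_weights), a)
    return softmax(beliefs)
```

Here `video_mem` and `subtitle_mem` are lists of equal-length feature vectors, one per temporal segment, and `gate_weights` stands in for a learned parameter. In a trained model all of these would come from learned embeddings rather than hand-set values.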

