CL · AI · CV · LG · Jun 23, 2016

Sort Story: Sorting Jumbled Images and Captions into Stories

arXiv:1606.07493v5 · 62 citations
Originality: Incremental advance
AI Analysis

The paper addresses temporal common sense, which is relevant to AI tasks such as QA and summarization, but the advance is incremental because it builds on existing sequencing approaches.

The paper tackles the problem of sequencing jumbled image-caption pairs into coherent stories, achieving strong results through ensemble-based methods that combine unary and pairwise predictions.

Temporal common sense has applications in AI tasks such as QA, multi-document summarization, and human-AI communication. We propose the task of sequencing -- given a jumbled set of aligned image-caption pairs that belong to a story, the task is to sort them such that the output sequence forms a coherent story. We present multiple approaches, via unary (position) and pairwise (order) predictions, and their ensemble-based combinations, achieving strong results on this task. We use both text-based and image-based features, which depict complementary improvements. Using qualitative examples, we demonstrate that our models have learnt interesting aspects of temporal common sense.
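As a rough illustration of how unary (position) and pairwise (order) predictions could be combined to sort a jumbled set, the sketch below scores every permutation of a small story and keeps the best one. It is not the paper's implementation: the score tables are made-up toy numbers, and the function names, the `weight` parameter, and the weighted-sum combination are assumptions chosen only to make the idea concrete.

```python
from itertools import permutations

def sequence_score(order, unary, pairwise, weight=0.5):
    """Score one candidate ordering (hypothetical scoring rule).

    order[k]       -- index of the item placed at position k
    unary[i][p]    -- assumed score for item i appearing at position p
    pairwise[a][b] -- assumed score for item a appearing before item b
    """
    unary_total = sum(unary[item][pos] for pos, item in enumerate(order))
    pairwise_total = sum(
        pairwise[order[i]][order[j]]
        for i in range(len(order))
        for j in range(i + 1, len(order))
    )
    # Weighted sum of the two signals; the weighting is an assumption.
    return weight * unary_total + (1 - weight) * pairwise_total

def best_order(unary, pairwise, weight=0.5):
    """Brute-force search over all orderings (fine for short stories)."""
    n = len(unary)
    return max(
        permutations(range(n)),
        key=lambda order: sequence_score(order, unary, pairwise, weight),
    )

# Toy scores for three image-caption pairs (invented for illustration).
unary = [
    [0.1, 0.2, 0.7],  # item 0 most likely last
    [0.8, 0.1, 0.1],  # item 1 most likely first
    [0.1, 0.7, 0.2],  # item 2 most likely in the middle
]
pairwise = [
    [0.0, 0.1, 0.1],
    [0.9, 0.0, 0.9],
    [0.9, 0.1, 0.0],
]

print(best_order(unary, pairwise))
```

Exhaustive search is only practical for short sequences, since the number of permutations grows factorially with story length.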
