CL, Nov 8, 2022

Tell Your Story: Task-Oriented Dialogs for Interactive Content Creation

Meta AI
arXiv:2211.03940v1, 3 citations, h-index: 23
Originality: Incremental advance
AI Analysis

This work addresses the time-consuming process of creating media montages. It is an incremental advance that extends simple media retrieval to multi-turn conversations.

The paper tackles cumbersome manual montage creation by proposing task-oriented dialogs as an interactive tool. It contributes a new dataset, C3, with 10k dialogs, and a demo application that shows feasibility.

People capture photos and videos to relive and share memories of personal significance. Recently, media montages (stories) have become a popular mode of sharing these memories due to their intuitive and powerful storytelling capabilities. However, creating such montages usually involves a lot of manual searches, clicks, and selections that are time-consuming and cumbersome, adversely affecting user experiences. To alleviate this, we propose task-oriented dialogs for montage creation as a novel interactive tool to seamlessly search, compile, and edit montages from a media collection. To the best of our knowledge, our work is the first to leverage multi-turn conversations for such a challenging application, extending the previous literature studying simple media retrieval tasks. We collect a new dataset C3 (Conversational Content Creation), comprising 10k dialogs conditioned on media montages simulated from a large media collection. We take a simulate-and-paraphrase approach to collect these dialogs to be both cost and time efficient, while drawing from natural language distribution. Our analysis and benchmarking of state-of-the-art language models showcase the multimodal challenges present in the dataset. Lastly, we present a real-world mobile demo application that shows the feasibility of the proposed work in real-world applications. Our code and data will be made publicly available.
