CL · AI · CV · Jul 13, 2021

FairyTailor: A Multimodal Generative Framework for Storytelling

arXiv:2108.04324v1 · 26 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of generating engaging, creative stories for users, particularly in children's entertainment. Its contribution is incremental: it adds interactivity and multimodality to existing methods.

The authors tackled the challenge of open-ended multimodal storytelling by introducing FairyTailor, a human-in-the-loop framework for co-creating children's fairytales with generated text and retrieved images, resulting in a dynamic tool that enables interactive formation and sharing of stories.

Storytelling is an open-ended task that entails creative thinking and requires a constant flow of ideas. Natural language generation (NLG) for storytelling is especially challenging because it requires the generated text to follow an overall theme while remaining creative and diverse to engage the reader. In this work, we introduce a system and a web-based demo, FairyTailor, for human-in-the-loop visual story co-creation. Users can create a cohesive children's fairytale by weaving generated texts and retrieved images with their input. FairyTailor adds another modality and modifies the text generation process to produce a coherent and creative sequence of text and images. To our knowledge, this is the first dynamic tool for multimodal story generation that allows interactive co-formation of both texts and images. It allows users to give feedback on co-created stories and share their results.

Code Implementations: 1 repo