CV · Dec 17, 2023

CogCartoon: Towards Practical Story Visualization

arXiv:2312.10718v1 · 7 citations · h-index: 1 · Int J Comput Vis
Originality: Incremental advance
AI Analysis

This paper addresses the problem of making story visualization practical for real-world applications. The contribution appears incremental, since it builds on pre-trained diffusion models.

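Existing story visualization methods are impractical because of their high data and storage demands and their limited flexibility. The paper introduces CogCartoon, which represents each character as a compact 316 KB plugin trained from a few samples and uses plugin-guided and layout-guided inference to let users add new characters and custom layouts. The authors report that it outperforms existing methods in both qualitative and quantitative studies.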

The state-of-the-art methods for story visualization demonstrate a significant demand for training data and storage, as well as limited flexibility in story presentation, thereby rendering them impractical for real-world applications. We introduce CogCartoon, a practical story visualization method based on pre-trained diffusion models. To alleviate dependence on data and storage, we propose an innovative strategy of character-plugin generation that can represent a specific character as a compact 316 KB plugin by using a few training samples. To facilitate enhanced flexibility, we employ a strategy of plugin-guided and layout-guided inference, enabling users to seamlessly incorporate new characters and custom layouts into the generated image results at their convenience. We have conducted comprehensive qualitative and quantitative studies, providing compelling evidence for the superiority of CogCartoon over existing methodologies. Moreover, CogCartoon demonstrates its power in tackling challenging tasks, including long story visualization and realistic style story visualization.

