ComicScene154: A Scene Dataset for Comic Analysis
This dataset gives researchers in multimodal narrative understanding and the NLP community a new resource for comic analysis, though the contribution is incremental in that it builds on existing multimodal data concepts.
The authors address the lack of computational resources for analyzing comics by introducing ComicScene154, a manually annotated dataset of scene-level narrative arcs from public-domain comic books, and by providing a baseline scene segmentation pipeline as an initial benchmark.
Comics offer a compelling yet under-explored domain for computational narrative analysis, combining text and imagery in ways distinct from purely textual or audiovisual media. We introduce ComicScene154, a manually annotated dataset of scene-level narrative arcs derived from public-domain comic books spanning diverse genres. By conceptualizing comics as an abstraction for narrative-driven, multimodal data, we highlight their potential to inform broader research on multimodal storytelling. To demonstrate the utility of ComicScene154, we present a baseline scene segmentation pipeline, providing an initial benchmark that future studies can build upon. Our results indicate that ComicScene154 constitutes a valuable resource for advancing computational methods in multimodal narrative understanding and expanding the scope of comic analysis within the Natural Language Processing community.