CL · Sep 28, 2024

SciDoc2Diagrammer-MAF: Towards Generation of Scientific Diagrams from Documents guided by Multi-Aspect Feedback Refinement

arXiv:2409.19242v2 · 31 citations · h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses the need to automate diagram creation for tutorials, presentations, and posters in scientific domains, which saves authors time. It is an incremental advance because it builds on existing text-to-image and code generation methods.

The paper tackles the problem of generating accurate and visually appealing scientific diagrams from long-context academic documents, a task that current text-to-image models struggle with. It proposes a multi-step pipeline with a refinement strategy that significantly improves factual correctness and visual appeal and outperforms existing models on the paper's benchmark.

Automating the creation of scientific diagrams from academic papers can significantly streamline the development of tutorials, presentations, and posters, thereby saving time and accelerating the process. Current text-to-image models struggle with generating accurate and visually appealing diagrams from long-context inputs. We propose SciDoc2Diagram, a task that extracts relevant information from scientific papers and generates diagrams, along with a benchmarking dataset, SciDoc2DiagramBench. We develop a multi-step pipeline SciDoc2Diagrammer that generates diagrams based on user intentions using intermediate code generation. We observed that initial diagram drafts were often incomplete or unfaithful to the source, leading us to develop SciDoc2Diagrammer-Multi-Aspect-Feedback (MAF), a refinement strategy that significantly enhances factual correctness and visual appeal and outperforms existing models on both automatic and human judgement.
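The generate-then-refine idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`refine_diagram`, `toy_generate`, `toy_critique`, `toy_revise`), the aspect list, and the `bar(...)` diagram code are all hypothetical. The aspects are inferred from the abstract's wording (incomplete, unfaithful, visual appeal), and the model calls are replaced by toy stand-ins so the loop can run on its own.

```python
from typing import Callable, Dict

# Hypothetical aspect names, inferred from the abstract's wording.
ASPECTS = ("completeness", "faithfulness", "visual_appeal")


def refine_diagram(
    extracted: Dict[str, float],
    intent: str,
    generate: Callable[[Dict[str, float], str], str],
    critique: Callable[[str, Dict[str, float], str], str],
    revise: Callable[[str, Dict[str, str], Dict[str, float]], str],
    max_rounds: int = 3,
) -> str:
    """Draft diagram code, then revise it until no aspect raises feedback."""
    code = generate(extracted, intent)
    for _ in range(max_rounds):
        # Collect one piece of feedback per aspect; an empty string means "no issue".
        feedback = {a: critique(a, extracted, code) for a in ASPECTS}
        feedback = {a: msg for a, msg in feedback.items() if msg}
        if not feedback:
            break
        code = revise(code, feedback, extracted)
    return code


# Toy stand-ins for the model calls (assumptions for illustration only).
def toy_generate(extracted: Dict[str, float], intent: str) -> str:
    # Deliberately incomplete first draft: plots only the first data point.
    name, value = next(iter(extracted.items()))
    return f'bar("{name}", {value})'


def toy_critique(aspect: str, extracted: Dict[str, float], code: str) -> str:
    # Only the completeness critic is implemented in this toy version.
    if aspect == "completeness":
        missing = [k for k in extracted if k not in code]
        return "missing: " + ", ".join(missing) if missing else ""
    return ""


def toy_revise(code: str, feedback: Dict[str, str], extracted: Dict[str, float]) -> str:
    # Append a bar for every extracted item the draft left out.
    extra = [f'bar("{k}", {v})' for k, v in extracted.items() if k not in code]
    return "\n".join([code] + extra)


final_code = refine_diagram(
    {"BLEU": 31.2, "ROUGE": 44.0},
    "compare metrics in a bar chart",
    toy_generate,
    toy_critique,
    toy_revise,
)
print(final_code)
```

In a real system, each callable would wrap a language-model prompt, and the intermediate code would be rendered to an image before the visual critic runs; the loop structure is the only part this sketch is meant to convey.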
