CL · Feb 21, 2025

Textual-to-Visual Iterative Self-Verification for Slide Generation

arXiv:2502.15412v1 · 2 citations · h-index: 10
Originality: Incremental advance
AI Analysis

This paper addresses the time-consuming task of slide generation for users who need automated tools, though it appears incremental: it builds on existing LLM-based agents by adding refinement mechanisms.

The paper automates the generation of missing presentation slides by decomposing the task into content and layout components and introducing a textual-to-visual self-verification process. It reports significant improvements over baselines in alignment, logical flow, visual appeal, and readability.

Generating presentation slides is a time-consuming task that urgently requires automation. Due to their limited flexibility and lack of automated refinement mechanisms, existing autonomous LLM-based agents face constraints in real-world applicability. We decompose the task of generating missing presentation slides into two key components: content generation and layout generation, aligning with the typical process of creating academic slides. First, we introduce a content generation approach that enhances coherence and relevance by incorporating context from surrounding slides and leveraging section retrieval strategies. For layout generation, we propose a textual-to-visual self-verification process using an LLM-based Reviewer + Refiner workflow, transforming complex textual layouts into intuitive visual formats. This modality transformation simplifies the task, enabling accurate and human-like review and refinement. Experiments show that our approach significantly outperforms baseline methods in terms of alignment, logical flow, visual appeal, and readability.
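
The Reviewer + Refiner loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a textual layout is a list of named boxes, uses a character grid as a stand-in for the visual rendering, and replaces the LLM-based Reviewer and Refiner with simple rule-based stubs. All names here (`Box`, `render`, `review`, `refine`, `self_verify`) are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Box:
    """One element of a textual layout (hypothetical format, grid units)."""
    name: str
    x: int
    y: int
    w: int
    h: int


Layout = List[Box]


def render(layout: Layout, width: int = 20, height: int = 8) -> str:
    """Textual-to-visual step: draw each box on a character grid."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for box in layout:
        for row in range(max(box.y, 0), min(box.y + box.h, height)):
            for col in range(max(box.x, 0), min(box.x + box.w, width)):
                # '#' marks a cell claimed by more than one box (an overlap).
                grid[row][col] = box.name[0] if grid[row][col] == "." else "#"
    return "\n".join("".join(cells) for cells in grid)


def review(picture: str) -> List[str]:
    """Stub Reviewer: inspects the rendered picture, not the coordinates."""
    return ["overlap"] if "#" in picture else []


def _overlaps(a: Box, b: Box) -> bool:
    return (a.x < b.x + b.w and b.x < a.x + a.w
            and a.y < b.y + b.h and b.y < a.y + a.h)


def refine(layout: Layout, issues: List[str]) -> Layout:
    """Stub Refiner: pushes a box below any earlier box it overlaps."""
    if "overlap" not in issues:
        return layout
    fixed: Layout = []
    for box in layout:
        for placed in fixed:
            if _overlaps(placed, box):
                box = replace(box, y=placed.y + placed.h)
        fixed.append(box)
    return fixed


def self_verify(
    layout: Layout,
    reviewer: Callable[[str], List[str]] = review,
    refiner: Callable[[Layout, List[str]], Layout] = refine,
    max_rounds: int = 3,
) -> Layout:
    """Render, review, and refine until the reviewer reports no issues."""
    for _ in range(max_rounds):
        issues = reviewer(render(layout))
        if not issues:
            break
        layout = refiner(layout, issues)
    return layout


# Usage: the body box starts one row too high and overlaps the title.
draft = [Box("title", 0, 0, 20, 2), Box("body", 0, 1, 20, 4)]
final = self_verify(draft)
print(render(final))
```

In this sketch the reviewer sees only the rendered grid, which mirrors the idea that reviewing a visual form is easier than reasoning over raw coordinates. In a setup closer to the paper, `reviewer` and `refiner` would presumably be calls to an LLM, which is why they are passed in as parameters.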
