CL · Mar 24

PaperVoyager: Building Interactive Web with Visual Language Models

arXiv:2603.22999 · 93.61 citations · h-index: 8
Predicted impact: top 18% in CL · last 90 days · Originality: Highly original
AI Analysis

The paper addresses a limitation of static artifacts: they cannot convey the dynamic mechanisms described in technical papers. It offers an interactive paradigm for scientific understanding instead.

The paper tackles the problem of converting research papers into executable interactive web systems. It proposes an agent that processes PDFs end-to-end without human intervention, and experiments on a benchmark of 19 papers show that it significantly improves the quality of the generated systems.

Recent advances in visual language models have enabled autonomous agents for complex reasoning, tool use, and document understanding. However, existing document agents mainly transform papers into static artifacts such as summaries, webpages, or slides, which are insufficient for technical papers involving dynamic mechanisms and state transitions. In this work, we propose a Paper-to-Interactive-System Agent that converts research papers into executable interactive web systems. Given a PDF paper, the agent performs end-to-end processing without human intervention, including paper understanding, system modeling, and interactive webpage synthesis, enabling users to manipulate inputs and observe dynamic behaviors. To evaluate this task, we introduce a benchmark of 19 research papers paired with expert-built interactive systems as ground truth. We further propose PaperVoyager, a structured generation framework that explicitly models mechanisms and interaction logic during synthesis. Experiments show that PaperVoyager significantly improves the quality of generated interactive systems, offering a new paradigm for interactive scientific paper understanding.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
