SE · AI · Apr 26

Query2Diagram: Answering Developer Queries with UML Diagrams

arXiv:2604.23816 · 85.6 · Has Code
Predicted impact: top 11% in SE · last 90 days
Originality: Incremental advance
AI Analysis

For software developers needing focused, on-demand documentation, this work provides a scalable approach to generate semantically relevant UML diagrams from code queries.

The paper introduces query-driven UML diagram generation, in which LLMs produce diagrams that directly answer natural language questions about code. Its best fine-tuned model achieves the highest F1 scores in the evaluation, with defect rates lower than those of state-of-the-art LLMs.

Software documentation frequently becomes outdated or does not exist at all, yet developers need focused views of their codebase to understand complex systems. While automated reverse engineering tools can generate UML diagrams from code, they produce overwhelming detail without considering developer intent. We introduce query-driven UML diagram generation, where LLMs create diagrams that directly answer natural language questions about code. Unlike existing methods, our approach produces semantically focused diagrams containing only relevant elements with contextual descriptions. We fine-tune Qwen2.5-Coder-14B on a curated dataset of code files, developer queries, and corresponding diagram representations in a structured JSON format, and we evaluate with both automatic detection of structural defects and human assessment of semantic relevance. Results demonstrate that fine-tuning on a modest amount of manually corrected data yields dramatic improvements: our best model achieves the highest F1 scores while reducing defect rates below those of state-of-the-art LLMs, generating diagrams that are both structurally sound and semantically faithful to developer queries. Thus, we establish the feasibility of using LLMs for scalable, contextual, on-demand documentation generation. We make our code and dataset publicly available at https://github.com/i-need-a-pencil/query2diagram.
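The abstract mentions a structured JSON diagram format, automatic detection of structural defects, and F1 scores, but gives no details of any of them. The following is only an illustrative sketch under assumed conventions: the field names (`nodes`, `edges`, `id`, `source`, `target`), the two defect types, and the set-based F1 are hypothetical and are not taken from the paper or its repository.

```python
import json

# Hypothetical diagram JSON. The field names are illustrative assumptions,
# not the schema used in the paper.
DIAGRAM_JSON = """
{
  "nodes": [
    {"id": "OrderService", "description": "Coordinates order placement"},
    {"id": "PaymentGateway", "description": "Charges the customer"}
  ],
  "edges": [
    {"source": "OrderService", "target": "PaymentGateway", "label": "calls"},
    {"source": "OrderService", "target": "Inventory", "label": "reserves"}
  ]
}
"""


def find_structural_defects(diagram):
    """Return a list of defect messages for one diagram dict.

    Checks two assumed defect types: missing or duplicate node ids,
    and edges whose endpoints do not refer to a declared node.
    """
    defects = []
    seen = set()
    for node in diagram.get("nodes", []):
        node_id = node.get("id")
        if not node_id:
            defects.append("node without id")
        elif node_id in seen:
            defects.append(f"duplicate node id: {node_id}")
        else:
            seen.add(node_id)
    for edge in diagram.get("edges", []):
        for end in ("source", "target"):
            ref = edge.get(end)
            if ref not in seen:
                defects.append(f"edge {end} refers to unknown node: {ref}")
    return defects


def element_f1(predicted, reference):
    """Set-based F1 between predicted and reference element names."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


diagram = json.loads(DIAGRAM_JSON)
defects = find_structural_defects(diagram)
print(defects)
```

In this example the second edge points at `Inventory`, which is not declared as a node, so the checker reports one dangling reference. A real evaluation would follow whatever schema and defect taxonomy the published dataset defines.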

