CV AI LG | Sep 26, 2025

UML-CoT: Structured Reasoning and Planning with Unified Modeling Language for Robotic Room Cleaning

arXiv:2509.22628v2 | 1 citation

Originality: Incremental advance

AI Analysis

UML-CoT addresses the need for more interpretable and executable reasoning in embodied AI tasks such as robotic room cleaning. It is an incremental advance over prior structured CoT methods.

The paper tackles the problem of unstructured reasoning in robotic tasks by introducing UML-CoT, a framework that uses Unified Modeling Language for structured reasoning and planning, resulting in improved interpretability, planning coherence, and execution success on a new benchmark.

Chain-of-Thought (CoT) prompting improves reasoning in large language models (LLMs), but its reliance on unstructured text limits interpretability and executability in embodied tasks. Prior work has explored structured CoTs using scene or logic graphs, yet these remain fundamentally limited: they model only low-order relations, lack constructs like inheritance or behavioral abstraction, and provide no standardized semantics for sequential or conditional planning. We propose UML-CoT, a structured reasoning and planning framework that leverages Unified Modeling Language (UML) to generate symbolic CoTs and executable action plans. UML class diagrams capture compositional object semantics, while activity diagrams model procedural control flow. Our three-stage training pipeline combines supervised fine-tuning with Group Relative Policy Optimization (GRPO), including reward learning from answer-only data. We evaluate UML-CoT on MRoom-30k, a new benchmark of cluttered room-cleaning scenarios. UML-CoT outperforms unstructured CoTs in interpretability, planning coherence, and execution success, highlighting UML as a more expressive and actionable structured reasoning formalism.
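
To make the division of labor in the abstract concrete, the following is a minimal Python sketch of how a class diagram (object semantics with inheritance) and an activity diagram (conditional control flow) could combine to yield an executable action plan. Everything here is an assumption for illustration: the names `UMLClass`, `SceneObject`, `Action`, `Decision`, `build_activity`, and `flatten`, the tiny `Item`/`Clothing`/`Trash` hierarchy, and the `bin`/`laundry_basket` destinations are hypothetical and are not taken from the paper, its data format, or MRoom-30k.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Union


@dataclass
class UMLClass:
    """Class-diagram node: a name plus an optional parent (inheritance)."""
    name: str
    parent: Optional["UMLClass"] = None

    def is_a(self, class_name: str) -> bool:
        # Walk up the inheritance chain looking for class_name.
        node: Optional[UMLClass] = self
        while node is not None:
            if node.name == class_name:
                return True
            node = node.parent
        return False


@dataclass
class SceneObject:
    """An instance in the room, typed by a class-diagram node."""
    name: str
    cls: UMLClass
    location: str


@dataclass
class Action:
    """Activity-diagram action node (one executable step)."""
    verb: str
    target: str
    destination: Optional[str] = None


@dataclass
class Decision:
    """Activity-diagram decision node with a guard and two branches."""
    guard: Callable[[], bool]
    then_branch: List["Node"]
    else_branch: List["Node"]


Node = Union[Action, Decision]


def build_activity(objects: List[SceneObject]) -> List[Node]:
    """Hypothetical rule: trash goes to the bin, anything else to the basket."""
    activity: List[Node] = []
    for obj in objects:
        pick = Action("pick", obj.name)
        activity.append(
            Decision(
                guard=lambda o=obj: o.cls.is_a("Trash"),  # bind obj per iteration
                then_branch=[pick, Action("place", obj.name, "bin")],
                else_branch=[pick, Action("place", obj.name, "laundry_basket")],
            )
        )
    return activity


def flatten(nodes: List[Node]) -> List[Action]:
    """Resolve decision nodes and return a linear action plan."""
    plan: List[Action] = []
    for node in nodes:
        if isinstance(node, Decision):
            branch = node.then_branch if node.guard() else node.else_branch
            plan.extend(flatten(branch))
        else:
            plan.append(node)
    return plan


# Assumed toy hierarchy and scene.
item = UMLClass("Item")
clothing = UMLClass("Clothing", parent=item)
trash = UMLClass("Trash", parent=item)

scene = [
    SceneObject("sock", clothing, "floor"),
    SceneObject("wrapper", trash, "desk"),
]

plan = flatten(build_activity(scene))
for step in plan:
    print(step.verb, step.target, step.destination or "")
```

The point of the sketch is only that typed objects and explicit decision nodes give a plan that can be inspected and executed step by step, which free-form CoT text does not directly provide. How the model actually emits and serializes its UML diagrams is not specified in this listing.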

