AI SE · Jul 22, 2025

LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning

Bo Hou, Xin Tan, Kai Zheng, Fang Liu, Yinghao Zhu, Li Zhang

arXiv:2507.16395v23.3 · h-index: 5

Originality: Highly original

AI Analysis

This work addresses a practical problem for software developers: automated commit untangling makes code review and maintenance easier. It represents a strong domain-specific advance.

The paper tackles the problem of untangling mixed changes in software commits by proposing ColaUntangle, a collaborative LLM-driven framework that models both explicit and implicit dependencies among code changes. It reports improvements of 44% on a C# dataset and 82% on a Java dataset over the best-performing baseline.

Atomic commits, which address a single development concern, are a best practice in software development. In practice, however, developers often produce tangled commits that mix unrelated changes, complicating code review and maintenance. Prior untangling approaches (rule-based, feature-based, or graph-based) have made progress but typically rely on shallow signals and struggle to distinguish explicit dependencies (e.g., control/data flow) from implicit ones (e.g., semantic or conceptual relationships). In this paper, we propose ColaUntangle, a new collaborative consultation framework for commit untangling that models both explicit and implicit dependencies among code changes. ColaUntangle integrates Large Language Model (LLM)-driven agents in a multi-agent architecture: one agent specializes in explicit dependencies, another in implicit ones, and a reviewer agent synthesizes their perspectives through iterative consultation. To capture structural and contextual information, we construct Explicit and Implicit Contexts, enabling agents to reason over code relationships with both symbolic and semantic depth. We evaluate ColaUntangle on two widely used datasets (1,612 C# and 14k Java tangled commits). Experimental results show that ColaUntangle outperforms the best-performing baseline, achieving an improvement of 44% on the C# dataset and 82% on the Java dataset. These findings highlight the potential of LLM-based collaborative frameworks for advancing automated commit untangling tasks.
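The consultation loop described in the abstract (an explicit-dependency agent, an implicit-dependency agent, and a reviewer that iterates until they converge) might be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the names `consult`, `as_partition`, and the `toy_*` functions are hypothetical, the agents are deterministic stand-ins rather than LLM calls, and the stopping rule (stop when both agents produce the same partition, or after `max_rounds`) is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Grouping = Dict[str, int]                      # hunk id -> concern label
Agent = Callable[[List[str], str], Grouping]   # (hunk ids, reviewer feedback) -> grouping
Reviewer = Callable[[Grouping, Grouping], Tuple[Grouping, str, bool]]


def as_partition(grouping: Grouping) -> frozenset:
    """Turn a labelled grouping into a label-independent set of groups."""
    groups: Dict[int, set] = {}
    for hunk, label in grouping.items():
        groups.setdefault(label, set()).add(hunk)
    return frozenset(frozenset(g) for g in groups.values())


def consult(hunks: List[str], explicit_agent: Agent, implicit_agent: Agent,
            reviewer: Reviewer, max_rounds: int = 3) -> Grouping:
    """Run both agents, let the reviewer merge, repeat until agreement (assumed rule)."""
    feedback = ""
    merged: Grouping = {}
    for _ in range(max_rounds):
        explicit_view = explicit_agent(hunks, feedback)
        implicit_view = implicit_agent(hunks, feedback)
        merged, feedback, agreed = reviewer(explicit_view, implicit_view)
        if agreed:
            break
    return merged


# Hypothetical stand-ins for the LLM-driven agents.
def toy_explicit_agent(hunks: List[str], feedback: str) -> Grouping:
    # Pretends h1 and h2 are linked by a data-flow dependency.
    return {"h1": 0, "h2": 0, "h3": 1}


def toy_implicit_agent(hunks: List[str], feedback: str) -> Grouping:
    # Initially keeps every hunk separate; revises its view once it gets feedback.
    if feedback:
        return {"h1": 0, "h2": 0, "h3": 1}
    return {"h1": 0, "h2": 1, "h3": 2}


def toy_reviewer(explicit_view: Grouping, implicit_view: Grouping):
    # Agreement is checked on partitions, so differing label numbers do not matter.
    agreed = as_partition(explicit_view) == as_partition(implicit_view)
    feedback = "" if agreed else "agents disagree; reconsider h1/h2"
    return explicit_view, feedback, agreed


result = consult(["h1", "h2", "h3"], toy_explicit_agent, toy_implicit_agent, toy_reviewer)
```

In this toy run the two agents disagree in the first round and agree in the second, so `result` groups `h1` with `h2` and leaves `h3` on its own. In a real system each agent would prompt an LLM with the Explicit or Implicit Context mentioned above; those prompts and contexts are not modelled here.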
