Yunwei Dong

IR
3papers
1citation
Novelty48%
AI Score43

3 Papers

IRMay 22
MGRetrieval: Memory-Guided Reflective Retrieval for Long-Term Dialogue Agents

Tan Wang, Yunwei Dong

Large Language Models (LLMs) have made significant progress in dialogue, yet redundant memory contexts severely limit their effectiveness in long-term dialogue agents. External memory systems have been proposed to improve memory maintenance. However, these systems mainly rely on one-shot retrieval, which limits their ability to retrieve sufficient and relevant evidence. Although recent methods introduce reflection into retrieval, their retrieval paths are generated by the LLM from limited evidence, leading to unstable retrieval and additional latency overhead. %These limitations highlight the need for effective retrieval mechanisms. To address these limitations, we propose MGRetrieval, a retrieval strategy that grounds reflective retrieval in the semantic structure of historical memories. Specifically, MGRetrieval consists of two steps: (1) It references the structure of historical memories to construct a more precise retrieval path. (2) The LLM retains critical memories and determines whether accumulated memories are sufficient to stop further iterative retrieval. This allows the retrieval process to follow semantically meaningful paths. Through memory-guided retrieval and critical memory propagation, MGRetrieval gradually constructs concise and sufficient memory contexts. Extensive experiments on LoCoMo show that MGRetrieval outperforms the strongest baseline by 8.91\% in F1 and 11.11\% in BLEU-1 on average across Qwen2.5-14B and Qwen3-14B, while maintaining practical token and latency costs. The code can be found in https://anonymous.4open.science/r/MGRetrieval.

SEApr 3
Runtime Execution Traces Guided Automated Program Repair with Multi-Agent Debate

Jiaqing Wu, Tong Wu, Manqing Zhang et al.

Automated Program Repair (APR) struggles with complex logic errors and silent failures. Current LLM-based APR methods are mostly static, relying on source code and basic test outputs, which fail to accurately capture complex runtime behaviors and dynamic data dependencies. While incorporating runtime evidence like execution traces exposes concrete state transitions, a single LLM interpreting this in isolation often overfits to specific hypotheses, producing patches that satisfy tests by coincidence rather than correct logic. Therefore, runtime evidence should act as objective constraints rather than mere additional input. We propose TraceRepair, a multi-agent framework that leverages runtime facts as shared constraints for patch validation. A probe agent captures execution snapshots of critical variables to form an objective repair basis. Meanwhile, a committee of specialized agents cross-verifies candidate patches to expose inconsistencies and iteratively refine them. Evaluated on the Defects4J benchmark, TraceRepair correctly fixes 392 defects, substantially outperforming existing LLM-based approaches. Extensive experiments demonstrate improved efficiency and strong generalization on a newly constructed dataset of recent bugs, confirming that performance gains arise from dynamic reasoning rather than memorization.

LOApr 27
Understanding and Improving Automated Proof Synthesis for Interactive Theorem Provers

Manqing Zhang, Yunwei Dong, Lingru Zhou et al.

Formal verification using interactive theorem provers ensures high-quality software. However, writing proof scripts for interactive theorem provers is labor-intensive and requires deep expertise. Recent studies have leveraged deep learning to automate theorem proving by learning from manually written proof corpora. Nevertheless, these techniques still achieve limited success rates in proof synthesis. This paper investigates the theorems that current proof synthesis techniques are unable to prove and analyzes their characteristics. Through an in-depth analysis of the proven theorems, proof scripts, and the proof search process, we identify the limitations of existing tools and summarize the key factors influencing proof success rates. Our research provides valuable insights for the future optimization of automated proof tools. Based on our empirical study, we discover that tactic selections conforming to human expert proof patterns are more likely to lead to successful proofs. Inspired by this finding, we propose a pattern-guided tactic search (PGTS) method to heuristically enhance the search process of existing proof synthesis tools. Our evaluation experiments demonstrate that PGTS improves the number of theorems proved by existing proof synthesis tools by an average of 8.05\%, while also achieving an average 20\% increase in previously unproven theorems. Furthermore, PGTS enhances the capability of proof synthesis tools to prove complex theorems and generates more concise proof scripts.