Zhangze Chen

CL
h-index33
4papers
9citations
Novelty56%
AI Score52

4 Papers

SEMar 10Code
ToolRosetta: Bridging Open-Source Repositories and Large Language Model Agents through Automated Tool Standardization

Shimin Di, Xujie Yuan, Hanghui Guo et al.

Reusing and invoking existing code remains costly and unreliable, as most practical tools are embedded in heterogeneous code repositories and lack standardized, executable interfaces. Although large language models (LLMs) and Model Context Protocol (MCP)-based tool invocation frameworks enable natural language task execution, current approaches rely heavily on manual tool curation and standardization, which fundamentally limits scalability. In this paper, we propose ToolRosetta, a unified framework that automatically translates open-source code repositories and APIs into MCP-compatible tools that can be reliably invoked by LLMs. Given a user task, ToolRosetta autonomously plans toolchains, identifies relevant codebases, and converts them into executable MCP services, enabling end-to-end task completion with minimal human intervention. In addition, ToolRosetta incorporates a security inspection layer to mitigate risks inherent in executing arbitrary code. Extensive experiments across diverse scientific domains demonstrate that ToolRosetta can automatically standardize a large number of open-source tools and reduce the human effort required for code reproduction and deployment. Notably, by seamlessly leveraging specialized open-source tools, ToolRosetta-powered agents consistently improve task completion performance compared to commercial LLMs and existing agent systems.

CVJul 5, 2025Code
Consistent and Invariant Generalization Learning for Short-video Misinformation Detection

Hanghui Guo, Weijie Shi, Mengze Li et al.

Short-video misinformation detection has attracted wide attention in the multi-modal domain, aiming to accurately identify the misinformation in the video format accompanied by the corresponding audio. Despite significant advancements, current models in this field, trained on particular domains (source domains), often exhibit unsatisfactory performance on unseen domains (target domains) due to domain gaps. To effectively realize such domain generalization on the short-video misinformation detection task, we propose deep insights into the characteristics of different domains: (1) The detection on various domains may mainly rely on different modalities (i.e., mainly focusing on videos or audios). To enhance domain generalization, it is crucial to achieve optimal model performance on all modalities simultaneously. (2) For some domains focusing on cross-modal joint fraud, a comprehensive analysis relying on cross-modal fusion is necessary. However, domain biases located in each modality (especially in each frame of videos) will be accumulated in this fusion process, which may seriously damage the final identification of misinformation. To address these issues, we propose a new DOmain generalization model via ConsisTency and invariance learning for shORt-video misinformation detection (named DOCTOR), which contains two characteristic modules: (1) We involve the cross-modal feature interpolation to map multiple modalities into a shared space and the interpolation distillation to synchronize multi-modal learning; (2) We design the diffusion model to add noise to retain core features of multi modal and enhance domain invariant features through cross-modal guided denoising. Extensive experiments demonstrate the effectiveness of our proposed DOCTOR model. Our code is public available at https://github.com/ghh1125/DOCTOR.

CLApr 14, 2025
DioR: Adaptive Cognitive Detection and Contextual Retrieval Optimization for Dynamic Retrieval-Augmented Generation

Hanghui Guo, Jia Zhu, Shimin Di et al.

Dynamic Retrieval-augmented Generation (RAG) has shown great success in mitigating hallucinations in large language models (LLMs) during generation. However, existing dynamic RAG methods face significant limitations in two key aspects: 1) Lack of an effective mechanism to control retrieval triggers, and 2) Lack of effective scrutiny of retrieval content. To address these limitations, we propose an innovative dynamic RAG method, DioR (Adaptive Cognitive Detection and Contextual Retrieval Optimization), which consists of two main components: adaptive cognitive detection and contextual retrieval optimization, specifically designed to determine when retrieval is needed and what to retrieve for LLMs is useful. Experimental results demonstrate that DioR achieves superior performance on all tasks, demonstrating the effectiveness of our work.

LGSep 30, 2025
MuPlon: Multi-Path Causal Optimization for Claim Verification through Controlling Confounding

Hanghui Guo, Shimin Di, Pasquale De Meo et al.

As a critical task in data quality control, claim verification aims to curb the spread of misinformation by assessing the truthfulness of claims based on a wide range of evidence. However, traditional methods often overlook the complex interactions between evidence, leading to unreliable verification results. A straightforward solution represents the claim and evidence as a fully connected graph, which we define as the Claim-Evidence Graph (C-E Graph). Nevertheless, claim verification methods based on fully connected graphs face two primary confounding challenges, Data Noise and Data Biases. To address these challenges, we propose a novel framework, Multi-Path Causal Optimization (MuPlon). MuPlon integrates a dual causal intervention strategy, consisting of the back-door path and front-door path. In the back-door path, MuPlon dilutes noisy node interference by optimizing node probability weights, while simultaneously strengthening the connections between relevant evidence nodes. In the front-door path, MuPlon extracts highly relevant subgraphs and constructs reasoning paths, further applying counterfactual reasoning to eliminate data biases within these paths. The experimental results demonstrate that MuPlon outperforms existing methods and achieves state-of-the-art performance.