Licheng Wang

CR
h-index19
11papers
168citations
Novelty52%
AI Score53

11 Papers

CVMar 25, 2022Code
Multimodal Pre-training Based on Graph Attention Network for Document Understanding

Zhenrong Zhang, Jiefeng Ma, Jun Du et al.

Document intelligence as a relatively new research topic supports many business applications. Its main task is to automatically read, understand, and analyze documents. However, due to the diversity of formats (invoices, reports, forms, etc.) and layouts in documents, it is difficult to make machines understand documents. In this paper, we present the GraphDoc, a multimodal graph attention-based model for various document understanding tasks. GraphDoc is pre-trained in a multimodal framework by utilizing text, layout, and image information simultaneously. In a document, a text block relies heavily on its surrounding contexts, accordingly we inject the graph structure into the attention mechanism to form a graph attention layer so that each input node can only attend to its neighborhoods. The input nodes of each graph attention layer are composed of textual, visual, and positional features from semantically meaningful regions in a document image. We do the multimodal feature fusion of each node by the gate fusion layer. The contextualization between each node is modeled by the graph attention layer. GraphDoc learns a generic representation from only 320k unlabeled documents via the Masked Sentence Modeling task. Extensive experimental results on the publicly available datasets show that GraphDoc achieves state-of-the-art performance, which demonstrates the effectiveness of our proposed method. The code is available at https://github.com/ZZR8066/GraphDoc.

AIAug 4, 2025Code
SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents

Jiaye Lin, Yifu Guo, Yuzhen Han et al.

Large Language Model (LLM)-based agents have recently shown impressive capabilities in complex reasoning and tool use via multi-step interactions with their environments. While these agents have the potential to tackle complicated tasks, their problem-solving process, i.e., agents' interaction trajectory leading to task completion, remains underexploited. These trajectories contain rich feedback that can navigate agents toward the right directions for solving problems correctly. Although prevailing approaches, such as Monte Carlo Tree Search (MCTS), can effectively balance exploration and exploitation, they ignore the interdependence among various trajectories and lack the diversity of search spaces, which leads to redundant reasoning and suboptimal outcomes. To address these challenges, we propose SE-Agent, a Self-Evolution framework that enables Agents to optimize their reasoning processes iteratively. Our approach revisits and enhances former pilot trajectories through three key operations: revision, recombination, and refinement. This evolutionary mechanism enables two critical advantages: (1) it expands the search space beyond local optima by intelligently exploring diverse solution paths guided by previous trajectories, and (2) it leverages cross-trajectory inspiration to efficiently enhance performance while mitigating the impact of suboptimal reasoning paths. Through these mechanisms, SE-Agent achieves continuous self-evolution that incrementally improves reasoning quality. We evaluate SE-Agent on SWE-bench Verified to resolve real-world GitHub issues. Experimental results across five strong LLMs show that integrating SE-Agent delivers up to 55% relative improvement, achieving state-of-the-art performance among all open-source agents on SWE-bench Verified. Our code and demonstration materials are publicly available at https://github.com/JARVIS-Xs/SE-Agent.

LGJan 29
EGAM: Extended Graph Attention Model for Solving Routing Problems

Licheng Wang, Yuzi Yan, Mingtao Huang et al.

Neural combinatorial optimization (NCO) solvers, implemented with graph neural networks (GNNs), have introduced new approaches for solving routing problems. Trained with reinforcement learning (RL), the state-of-the-art graph attention model (GAM) achieves near-optimal solutions without requiring expert knowledge or labeled data. In this work, we generalize the existing graph attention mechanism and propose the extended graph attention model (EGAM). Our model utilizes multi-head dot-product attention to update both node and edge embeddings, addressing the limitations of the conventional GAM, which considers only node features. We employ an autoregressive encoder-decoder architecture and train it with policy gradient algorithms that incorporate a specially designed baseline. Experiments show that EGAM matches or outperforms existing methods across various routing problems. Notably, the proposed model demonstrates exceptional performance on highly constrained problems, highlighting its efficiency in handling complex graph structures.

CRApr 26
Spore: Efficient and Training-Free Privacy Extraction Attack on LLMs via Inference-Time Hybrid Probing

Yu Cui, Ruiqing Yue, Hang Fu et al.

With the wide adoption of personal AI assistants such as OpenClaw, privacy leakage in user interaction contexts with large language model (LLM) agents has become a critical issue. Existing privacy attacks against LLMs primarily target training data, while research on inference-time contextual privacy risks in LLM agent memory remains limited. Moreover, prior methods often incur high attack costs, requiring multiple queries or relying on white-box assumptions, which limits their practicality in real-world deployments. To address these issues, we propose a training-free privacy extraction attack targeting LLM agent memory, which we name \textsc{Spore}. \textsc{Spore} is compatible with both black-box and gray-box settings. In the black-box setting, \textsc{Spore} can efficiently extract a small candidate set via a single query to recover the original private information. In the gray-box setting, \textsc{Spore} allows the attacker to leverage multi-ranked tokens for more accurate and faster privacy extraction. We provide an information-theoretic analysis of \textsc{Spore} and show that it achieves high query efficiency with substantial per query information leakage. Experiments on multiple frontier LLMs show that \textsc{Spore} outperforms attack success rate over existing state-of-the-art (SOTA) schemes. It also maintains low attack cost and remains stable across different model parameter settings. We further evaluate the robustness of \textsc{Spore} against existing defense mechanisms. Our results show that \textsc{Spore} consistently bypasses both detection and strong safety alignment, demonstrating resilient performance in diverse defensive settings and real-world safety threats.

AISep 14, 2025
Free-MAD: Consensus-Free Multi-Agent Debate

Yu Cui, Hang Fu, Haibin Zhang et al.

Multi-agent debate (MAD) is an emerging approach to improving the reasoning capabilities of large language models (LLMs). Existing MAD methods rely on multiple rounds of interaction among agents to reach consensus, and the final output is selected by majority voting in the last round. However, this consensus-based design faces several limitations. First, multiple rounds of communication increases token overhead and limits scalability. Second, due to the inherent conformity of LLMs, agents that initially produce correct responses may be influenced by incorrect ones during the debate process, causing error propagation. Third, majority voting introduces randomness and unfairness in the decision-making phase, and can degrade the reasoning performance. To address these issues, we propose \textsc{Free-MAD}, a novel MAD framework that eliminates the need for consensus among agents. \textsc{Free-MAD} introduces a novel score-based decision mechanism that evaluates the entire debate trajectory rather than relying on the last round only. This mechanism tracks how each agent's reasoning evolves, enabling more accurate and fair outcomes. In addition, \textsc{Free-MAD} reconstructs the debate phase by introducing anti-conformity, a mechanism that enables agents to mitigate excessive influence from the majority. Experiments on eight benchmark datasets demonstrate that \textsc{Free-MAD} significantly improves reasoning performance while requiring only a single-round debate and thus reducing token costs. We also show that compared to existing MAD approaches, \textsc{Free-MAD} exhibits improved robustness in real-world attack scenarios.

LGNov 27, 2019
SecureGBM: Secure Multi-Party Gradient Boosting

Zhi Fengy, Haoyi Xiong, Chuanyuan Song et al.

Federated machine learning systems have been widely used to facilitate the joint data analytics across the distributed datasets owned by the different parties that do not trust each others. In this paper, we proposed a novel Gradient Boosting Machines (GBM) framework SecureGBM built-up with a multi-party computation model based on semi-homomorphic encryption, where every involved party can jointly obtain a shared Gradient Boosting machines model while protecting their own data from the potential privacy leakage and inferential identification. More specific, our work focused on a specific "dual--party" secure learning scenario based on two parties -- both party own an unique view (i.e., attributes or features) to the sample group of samples while only one party owns the labels. In such scenario, feature and label data are not allowed to share with others. To achieve the above goal, we firstly extent -- LightGBM -- a well known implementation of tree-based GBM through covering its key operations for training and inference with SEAL homomorphic encryption schemes. However, the performance of such re-implementation is significantly bottle-necked by the explosive inflation of the communication payloads, based on ciphertexts subject to the increasing length of plaintexts. In this way, we then proposed to use stochastic approximation techniques to reduced the communication payloads while accelerating the overall training procedure in a statistical manner. Our experiments using the real-world data showed that SecureGBM can well secure the communication and computation of LightGBM training and inference procedures for the both parties while only losing less than 3% AUC, using the same number of iterations for gradient boosting, on a wide range of benchmark datasets.

LGOct 18, 2016
Improving Covariance-Regularized Discriminant Analysis for EHR-based Predictive Analytics of Diseases

Sijia Yang, Haoyi Xiong, Kaibo Xu et al.

Linear Discriminant Analysis (LDA) is a well-known technique for feature extraction and dimension reduction. The performance of classical LDA, however, significantly degrades on the High Dimension Low Sample Size (HDLSS) data for the ill-posed inverse problem. Existing approaches for HDLSS data classification typically assume the data in question are with Gaussian distribution and deal the HDLSS classification problem with regularization. However, these assumptions are too strict to hold in many emerging real-life applications, such as enabling personalized predictive analysis using Electronic Health Records (EHRs) data collected from an extremely limited number of patients who have been diagnosed with or without the target disease for prediction. In this paper, we revised the problem of predictive analysis of disease using personal EHR data and LDA classifier. To fill the gap, in this paper, we first studied an analytical model that understands the accuracy of LDA for classifying data with arbitrary distribution. The model gives a theoretical upper bound of LDA error rate that is controlled by two factors: (1) the statistical convergence rate of (inverse) covariance matrix estimators and (2) the divergence of the training/testing datasets to fitted distributions. To this end, we could lower the error rate by balancing the two factors for better classification performance. Hereby, we further proposed a novel LDA classifier De-Sparse that leverages De-sparsified Graphical Lasso to improve the estimation of LDA, which outperforms state-of-the-art LDA approaches developed for HDLSS data. Such advances and effectiveness are further demonstrated by both theoretical analysis and extensive experiments on EHR datasets.

CRMay 21, 2016
A Miniature CCA2 Public key Encryption scheme based on non-Abelian factorization problems in Lie Groups

Haibo Hong, Licheng Wang, Jun Shao et al.

Since 1870s, scientists have been taking deep insight into Lie groups and Lie algebras. With the development of Lie theory, Lie groups have got profound significance in many branches of mathematics and physics. In Lie theory, exponential mapping between Lie groups and Lie algebras plays a crucial role. Exponential mapping is the mechanism for passing information from Lie algebras to Lie groups. Since many computations are performed much more easily by employing Lie algebras, exponential mapping is indispensable while studying Lie groups. In this paper, we first put forward a novel idea of designing cryptosystem based on Lie groups and Lie algebras. Besides, combing with discrete logarithm problem(DLP) and factorization problem(FP), we propose some new intractable assumptions based on exponential mapping. Moreover, in analog with Boyen's sceme(AsiaCrypt 2007), we disign a public key encryption scheme based on non-Abelian factorization problems in Lie Groups. Finally, our proposal is proved to be IND-CCA2 secure in the random oracle model.

CRMay 21, 2016
Public Key Encryption in Non-Abelian Groups

Haibo Hong, Jun Shao, Licheng Wang et al.

In this paper, we propose a brand new public key encryption scheme in the Lie group that is a non-abelian group. In particular, we firstly investigate the intractability assumptions in the Lie group, including the non-abelian factoring assumption and non-abelian inserting assumption. After that, by using the FO technique, a CCA secure public key encryption scheme in the Lie group is proposed. At last, we present the security proof in the random oracle based on the non-abelian inserting assumption.

CRJul 5, 2015
Minimal Logarithmic Signatures for one type of Classical Groups

Haibo Hong, Licheng Wang, Haseeb Ahmad et al.

As a special type of factorization of finite groups, logarithmic signature (LS) is used as the main component of cryptographic keys for secret key cryptosystems such as PGM and public key cryptosystems like MST1, MST2 and MST3. An LS with the shortest length, called a minimal logarithmic signature (MLS), is even desirable for cryptographic applications. The MLS conjecture states that every finite simple group has an MLS. Recently, the conjecture has been shown to be true for general linear groups GLn(q), special linear groups SLn(q), and symplectic groups Spn(q) with q a power of primes and for orthogonal groups On(q) with q as a power of 2. In this paper, we present new constructions of minimal logarithmic signatures for the orthogonal group On(q) and SOn(q) with q as a power of odd primes. Furthermore, we give constructions of MLSs for a type of classical groups projective commutator subgroup.

CRJul 5, 2015
Minimal Logarithmic Signatures for Sporadic Groups

Haibo Hong, Licheng Wang, Haseeb Ahmad et al.

As a special type of factorization of finite groups, logarithmic signature (LS) is used as the main component of cryptographic keys for secret key cryptosystems such as PGM and public key cryptosystems like MST1, MST2 and MST3. An LS with the shortest length is called a minimal logarithmic signature (MLS) and is even desirable for cryptographic constructions. The MLS conjecture states that every finite simple group has an MLS. Until now, the MLS conjecture has been proved true for some families of simple groups. In this paper, we will prove the existence of minimal logarithmic signatures for some sporadic groups.