Wenwei Li

CL
h-index: 4
5 papers
8 citations
Novelty: 57%
AI Score: 50

5 Papers

CL · Feb 26
Search-P1: Path-Centric Reward Shaping for Stable and Efficient Agentic RAG Training

Tianle Xia, Ming Xu, Lingxiang Hu et al.

Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by incorporating external knowledge, yet traditional single-round retrieval struggles with complex multi-step reasoning. Agentic RAG addresses this by enabling LLMs to dynamically decide when and what to retrieve, but current RL-based training methods suffer from sparse outcome rewards that discard intermediate signals and low sample efficiency where failed samples contribute nothing. We propose Search-P1, a framework that introduces path-centric reward shaping for agentic RAG training, comprising two key components: (1) Path-Centric Reward, which evaluates the structural quality of reasoning trajectories through order-agnostic step coverage and soft scoring that extracts learning signals even from failed samples, and (2) Dual-Track Path Scoring with offline-generated reference planners that assesses paths from both self-consistency and reference-alignment perspectives. Experiments on multiple QA benchmarks demonstrate that Search-P1 achieves significant improvements over Search-R1 and other strong baselines, with an average accuracy gain of 7.7 points.
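To make the idea of order-agnostic step coverage with soft scoring concrete, here is a minimal sketch. It is not the Search-P1 implementation: the exact-match comparison of step labels, the function names, and the `partial_weight` value are all assumptions chosen for illustration.

```python
def step_coverage(reference_steps, trajectory_steps):
    """Fraction of reference steps that appear anywhere in the trajectory.

    Order is ignored: steps are compared as sets (hypothetical exact matching).
    """
    reference = set(reference_steps)
    if not reference:
        return 0.0
    return len(reference & set(trajectory_steps)) / len(reference)


def soft_path_reward(answer_correct, coverage, partial_weight=0.5):
    """Full reward for a correct answer; partial credit for a failed sample.

    partial_weight is an assumed hyperparameter, not a value from the paper.
    """
    if answer_correct:
        return 1.0
    return partial_weight * coverage


reference = ["find_author", "find_birthplace", "find_population"]
trajectory = ["find_birthplace", "find_author"]  # wrong order, one step missing
coverage = step_coverage(reference, trajectory)
print(soft_path_reward(False, coverage))
```

Under these assumptions, a failed sample that covers two of three reference steps still receives a nonzero reward, which is the sense in which soft scoring extracts a learning signal from failures.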

CL · Feb 26
Towards Faithful Industrial RAG: A Reinforced Co-adaptation Framework for Advertising QA

Wenwei Li, Ming Xu, Tianle Xia et al.

Industrial advertising question answering (QA) is a high-stakes task in which hallucinated content, particularly fabricated URLs, can lead to financial loss, compliance violations, and legal risk. Although Retrieval-Augmented Generation (RAG) is widely adopted, deploying it in production remains challenging because industrial knowledge is inherently relational, frequently updated, and insufficiently aligned with generation objectives. We propose a reinforced co-adaptation framework that jointly optimizes retrieval and generation through two components: (1) Graph-aware Retrieval (GraphRAG), which models entity-relation structure over a high-citation knowledge subgraph for multi-hop, domain-specific evidence selection; and (2) evidence-constrained reinforcement learning via Group Relative Policy Optimization (GRPO) with multi-dimensional rewards covering faithfulness, style compliance, safety, and URL validity. Experiments on an internal advertising QA dataset show consistent gains across expert-judged dimensions including accuracy, completeness, and safety, while reducing the hallucination rate by 72%. A two-week online A/B test demonstrates a 28.6% increase in like rate, a 46.2% decrease in dislike rate, and a 92.7% reduction in URL hallucination. The system has been running in production for over half a year and has served millions of QA interactions.
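The group-relative step at the core of GRPO can be sketched as follows: each sampled response's reward is normalized against the mean and standard deviation of its own group. The weighted-sum combination of reward dimensions and the weight values below are assumptions for illustration, not details taken from the paper, and implementations differ on whether they use the population or sample standard deviation.

```python
from statistics import mean, pstdev


def combined_reward(scores, weights):
    """Weighted sum over reward dimensions (assumed aggregation scheme)."""
    return sum(weights[name] * scores[name] for name in weights)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its group's mean and standard deviation."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an implementation choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical weights over the four dimensions named in the abstract.
weights = {"faithfulness": 0.4, "style": 0.2, "safety": 0.2, "url_validity": 0.2}
group = [
    {"faithfulness": 1.0, "style": 1.0, "safety": 1.0, "url_validity": 1.0},
    {"faithfulness": 0.5, "style": 1.0, "safety": 1.0, "url_validity": 0.0},
    {"faithfulness": 0.0, "style": 0.5, "safety": 1.0, "url_validity": 0.0},
]
rewards = [combined_reward(scores, weights) for scores in group]
print(group_relative_advantages(rewards))
```

Responses that score above their group's average receive positive advantages and the rest receive negative ones, so no separate value model is needed to form a baseline.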

CL · Feb 15 · Code
AD-Bench: A Real-World, Trajectory-Aware Advertising Analytics Benchmark for LLM Agents

Lingxiang Hu, Yiding Sun, Tianle Xia et al.

While Large Language Model (LLM) agents have achieved remarkable progress in complex reasoning tasks, evaluating their performance in real-world environments has become a critical problem. Current benchmarks, however, are largely restricted to idealized simulations, failing to address the practical demands of specialized domains like advertising and marketing analytics. In these fields, tasks are inherently more complex, often requiring multi-round interaction with professional marketing tools. To address this gap, we propose AD-Bench, a benchmark designed based on real-world business requirements of advertising and marketing platforms. AD-Bench is constructed from real user marketing analysis requests, with domain experts providing verifiable reference answers and corresponding reference tool-call trajectories. The benchmark categorizes requests into three difficulty levels (L1-L3) to evaluate agents' capabilities under multi-round, multi-tool collaboration. Experiments show that on AD-Bench, Gemini-3-Pro achieves Pass@1 = 68.0% and Pass@3 = 83.0%, but performance drops significantly on L3 to Pass@1 = 49.4% and Pass@3 = 62.1%, with a trajectory coverage of 70.1%, indicating that even state-of-the-art models still exhibit substantial capability gaps in complex advertising and marketing analysis scenarios. AD-Bench provides a realistic benchmark for evaluating and improving advertising marketing agents. The leaderboard and code are available at https://github.com/Emanual20/adbench-leaderboard.
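For readers unfamiliar with Pass@k, the sketch below uses the unbiased estimator that is common in code-generation evaluation: given n sampled attempts of which c are correct, it estimates the probability that at least one of k attempts passes. Whether AD-Bench computes Pass@k this way is an assumption; the abstract does not say.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate P(at least one of k attempts passes) from n samples, c correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k incorrect samples exist.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(per_task_counts, k):
    """Average the per-task estimate over (n, c) pairs, one pair per task."""
    return sum(pass_at_k(n, c, k) for n, c in per_task_counts) / len(per_task_counts)


# Hypothetical per-task results: (samples drawn, samples correct).
tasks = [(3, 3), (3, 1), (3, 0)]
print(benchmark_pass_at_k(tasks, k=1), benchmark_pass_at_k(tasks, k=3))
```

Pass@3 is never lower than Pass@1 under this estimator, which matches the direction of the gap between the two figures reported above.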

IT · Mar 8
Pushing Bistatic Wireless Sensing toward High Accuracy at the Sub-Wavelength Scale

Wenwei Li, Jiarun Zhou, Qinxiao Quan et al.

Contactless sensing using wireless communication signals has garnered significant attention due to its non-intrusive nature and ubiquitous infrastructure. Despite the promise, the inherent bistatic deployment of wireless communication introduces clock asynchronism, which leads to unknown phase offsets in channel response and hinders fine-grained sensing. State-of-the-art systems widely adopt the cross-antenna channel ratio to cancel these detrimental phase offsets. However, the channel ratio preserves sensing feature accuracy only at integer-wavelength target displacements, losing sub-wavelength fidelity. To overcome this limitation, we derive the first quantitative mapping between the distorted ratio feature and the ideal channel feature. Building on this foundation, we develop a robust framework that leverages channel response amplitude to recover the ideal channel feature from the distorted ratio. Real-world experiments across Wi-Fi and LoRa demonstrate that our method can effectively reconstruct sub-wavelength displacement details, achieving nearly an order-of-magnitude improvement in accuracy.
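The reason the cross-antenna channel ratio cancels the phase offset can be shown numerically: if both antennas of one receiver share the same unknown offset, the offset multiplies both channel responses and divides out. This sketch covers only that cancellation under an idealized, noise-free model; it does not reproduce the paper's mapping or its amplitude-based recovery, and the channel values are made up.

```python
import cmath


def observed_channel(true_channel, phase_offset):
    """Channel response as measured under an unknown common phase offset."""
    return true_channel * cmath.exp(1j * phase_offset)


def channel_ratio(h_antenna_1, h_antenna_2):
    """Cross-antenna ratio of two measured channel responses."""
    return h_antenna_1 / h_antenna_2


# Hypothetical true channel responses at two receive antennas.
h1 = complex(0.8, 0.3)
h2 = complex(0.5, -0.4)

for offset in (0.0, 1.2, 2.9):  # offset drifts from packet to packet
    ratio = channel_ratio(observed_channel(h1, offset), observed_channel(h2, offset))
    print(offset, ratio)  # the ratio stays the same for every offset
```

The ratio removes the offset but is no longer the channel response itself, which is the distortion the abstract says the proposed framework corrects.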

CV · Apr 23, 2025
A Few-Shot Metric Learning Method with Dual-Channel Attention for Cross-Modal Same-Neuron Identification

Wenwei Li, Liyi Cai, Wu Chen et al.

In neuroscience research, achieving single-neuron matching across different imaging modalities is critical for understanding the relationship between neuronal structure and function. However, modality gaps and limited annotations present significant challenges. We propose a few-shot metric learning method with a dual-channel attention mechanism and a pretrained vision transformer to enable robust cross-modal neuron identification. The local and global channels extract soma morphology and fiber context, respectively, and a gating mechanism fuses their outputs. To enhance the model's fine-grained discrimination capability, we introduce a hard sample mining strategy based on the MultiSimilarityMiner algorithm, along with the Circle Loss function. Experiments on two-photon and fMOST datasets demonstrate superior Top-K accuracy and recall compared to existing methods. Ablation studies and t-SNE visualizations validate the effectiveness of each module. The method also achieves a favorable trade-off between accuracy and training efficiency under different fine-tuning strategies. These results suggest that the proposed approach offers a promising technical solution for accurate single-cell level matching and multimodal neuroimaging integration.
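One common way to fuse two feature channels with a gate is an element-wise convex combination, sketched below. The abstract does not specify the form of the gating mechanism, so this formulation is an assumption. In a trained model the gate logits would come from a learned layer; here they are passed in directly to keep the example self-contained.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(local_feat, global_feat, gate_logits):
    """Blend two feature vectors element-wise: g * local + (1 - g) * global.

    gate_logits is a hypothetical stand-in for the output of a learned gate.
    """
    if not (len(local_feat) == len(global_feat) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for local_value, global_value, logit in zip(local_feat, global_feat, gate_logits):
        gate = sigmoid(logit)
        fused.append(gate * local_value + (1.0 - gate) * global_value)
    return fused


soma_features = [1.0, 2.0]   # local channel (soma morphology)
fiber_features = [3.0, 4.0]  # global channel (fiber context)
print(gated_fusion(soma_features, fiber_features, [0.0, 0.0]))
```

A logit of zero weights both channels equally, while large positive or negative logits let the model rely almost entirely on the local or the global channel for that feature dimension.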