CL Oct 8, 2025
Adaptive Tool Generation with Models as Tools and Reinforcement Learning
Chenpeng Wang, Xiaojie Cheng, Chunye Wang et al.
Tool-augmented language models have demonstrated strong capabilities, but their reliance on live API access creates scalability and reliability challenges during training and deployment. We propose MTR, a simulation-first training framework for tool-augmented reasoning. Instead of relying on live APIs, MTR learns from complete ReAct traces with schema-validated, simulated observations. Our approach uses a multi-agent architecture in which a ToolMaker generates task-specific, OpenAI-compatible tool interfaces, an AutoAgent produces structured think-act-observe sequences, and a ToolActor simulates realistic responses. Training proceeds in two stages: Stage-1 Supervised Fine-Tuning (SFT) teaches 'trace grammar' from complete reasoning sequences; Stage-2 Group Relative Policy Optimization (GRPO) optimizes strategy with a composite trace reward that balances answer correctness and internal consistency. Across four multi-hop QA benchmarks (HotpotQA, MuSiQue, 2WikiMultiHopQA, Bamboogle), MTR attains Exact Match (EM) scores competitive with live-API systems and excels on reasoning-intensive tasks, suggesting that effective tool reasoning can be learned from structured traces without live interactions.
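The Stage-2 idea of scoring a group of sampled traces with a composite reward and normalizing within the group can be sketched as follows. This is a minimal illustration under stated assumptions, not the MTR implementation: the function names, the boolean inputs, and the weights `w_answer` and `w_consistency` are hypothetical, and the abstract does not say how the two reward terms are actually combined. The group normalization shown (subtract the group mean, divide by the group standard deviation) is the commonly described form of the group-relative advantage in GRPO.

```python
from statistics import mean, pstdev


def composite_reward(answer_correct: bool, trace_consistent: bool,
                     w_answer: float = 0.7, w_consistency: float = 0.3) -> float:
    """Weighted sum of answer correctness and trace consistency.

    The weights are illustrative assumptions, not values from the paper.
    """
    return w_answer * float(answer_correct) + w_consistency * float(trace_consistent)


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against the other samples in the same group.

    Each advantage is (reward - group mean) / (group std + eps), so traces
    are ranked relative to their siblings rather than on an absolute scale.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical traces sampled for the same question.
group = [
    composite_reward(True, True),
    composite_reward(True, False),
    composite_reward(False, True),
    composite_reward(False, False),
]
advantages = group_relative_advantages(group)
```

Traces that are both correct and consistent receive the largest advantage in the group, and a group in which every trace scores the same yields all-zero advantages, so it contributes no gradient signal.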
SP May 30, 2020
Sequence to Point Learning Based on Bidirectional Dilated Residual Network for Non Intrusive Load Monitoring
Ziyue Jia, Linfeng Yang, Zhenrong Zhang et al.
Non-Intrusive Load Monitoring (NILM), or Energy Disaggregation (ED), seeks to save energy by decomposing the aggregate power reading of a whole house into the power readings of its individual appliances. It is a single-channel blind source separation (SCBSS) problem and a difficult prediction problem because it is unidentifiable. Recent research shows that deep learning is increasingly popular for the NILM problem. The ability of a neural network to extract load features is closely related to its depth. However, deep neural networks are difficult to train because of exploding gradients, vanishing gradients, and network degradation. To address these problems, we propose a sequence-to-point learning framework based on bidirectional (non-causal) dilated convolution for NILM. For a convincing evaluation, we compare our method directly with the state-of-the-art method, Seq2point (Zhang), and indirectly with existing algorithms using the same two datasets and metrics. Experiments on the REDD and UK-DALE datasets show that our proposed approach is far superior to existing approaches on all appliances.
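Two ideas named in this abstract can be illustrated with a small, dependency-free sketch. It is not the authors' network: the function names, the zero padding at the sequence edges, and the toy inputs are assumptions made for illustration. The first function builds sequence-to-point training pairs, where each sliding window of the aggregate signal is paired with the appliance reading at the window's midpoint. The second applies a single non-causal (bidirectional) dilated convolution, whose taps reach both backward and forward in time.

```python
def seq2point_pairs(aggregate, appliance, window):
    """Pair each aggregate window with the appliance value at its midpoint."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a single midpoint")
    if len(aggregate) != len(appliance):
        raise ValueError("signals must have the same length")
    half = window // 2
    pairs = []
    for start in range(len(aggregate) - window + 1):
        inputs = aggregate[start:start + window]
        target = appliance[start + half]  # midpoint of the same window
        pairs.append((inputs, target))
    return pairs


def noncausal_dilated_conv(x, weights, dilation):
    """1-D dilated convolution centered on each time step.

    Taps sit at t + (j - k // 2) * dilation, so the filter sees past and
    future samples. Positions outside the signal are treated as zero.
    """
    half = len(weights) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t + (j - half) * dilation
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


pairs = seq2point_pairs([10, 20, 30, 40, 50], [1, 2, 3, 4, 5], window=3)
smoothed = noncausal_dilated_conv([1, 2, 3, 4, 5], [1, 1, 1], dilation=2)
```

Stacking such layers with growing dilation widens the receptive field quickly: with kernel size k and dilations d_1, ..., d_n, it spans 1 + (k - 1)(d_1 + ... + d_n) samples.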
CV Nov 26, 2018
Low-Dose CT via Deep CNN with Skip Connection and Network in Network
Chenyu You, Linfeng Yang, Yi Zhang et al.
A major challenge in computed tomography (CT) is how to minimize patient radiation exposure without compromising image quality and diagnostic performance. The use of deep convolutional (Conv) neural networks for noise reduction in Low-Dose CT (LDCT) images has recently shown great potential in this important application. In this paper, we present a highly efficient and effective neural network model for LDCT image noise reduction. Specifically, to capture local anatomical features, we integrate Deep Convolutional Neural Networks (CNNs) and Skip connection layers for feature extraction. We also introduce a parallelized $1\times 1$ CNN, called Network in Network, to lower the dimensionality of the output from the previous layer, achieving faster computation with less feature loss. To optimize the performance of the network, we adopt a Wasserstein generative adversarial network (WGAN) framework. Quantitative and qualitative comparisons demonstrate that our proposed network model can produce images with lower noise and more structural details than state-of-the-art noise-reduction methods.
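The dimensionality reduction attributed to the $1\times 1$ convolution can be shown with a small sketch. It assumes a feature map stored as nested lists indexed `[channel][row][column]` and a hypothetical helper `conv1x1`; it illustrates only the general operation, not the network described in the abstract. A $1\times 1$ convolution mixes channels at each pixel independently, so choosing fewer output channels than input channels shrinks the representation without touching the spatial layout.

```python
def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution (no bias) to a [C_in][H][W] feature map.

    weights has shape [C_out][C_in]; each output channel is a per-pixel
    weighted sum over the input channels.
    """
    c_in = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    output = []
    for channel_weights in weights:
        if len(channel_weights) != c_in:
            raise ValueError("each weight row needs one entry per input channel")
        channel = [
            [
                sum(channel_weights[c] * feature_map[c][i][j] for c in range(c_in))
                for j in range(width)
            ]
            for i in range(height)
        ]
        output.append(channel)
    return output


# Three input channels of size 2x2, reduced to a single output channel.
features = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
    [[100, 200], [300, 400]],
]
reduced = conv1x1(features, [[1, 1, 1]])
```

Here the three input channels collapse to one, while the 2x2 spatial grid is unchanged.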