CL, Mar 29
PRBench: End-to-end Paper Reproduction in Physics Research
Shi Qiu, Junyi Deng, Yiwei Deng et al.
AI agents powered by large language models exhibit strong reasoning and problem-solving capabilities, enabling them to assist with scientific research tasks such as formula derivation and code generation. However, whether these agents can reliably perform end-to-end reproduction from real scientific papers remains an open question. We introduce PRBench, a benchmark of 30 expert-curated tasks spanning 11 subfields of physics. Each task requires an agent to comprehend the methodology of a published paper, implement the corresponding algorithms from scratch, and produce quantitative results matching the original publication. Agents are provided only with the task instruction and paper content, and operate in a sandboxed execution environment. All tasks are contributed by domain experts from over 20 research groups at the School of Physics, Peking University; each task is grounded in a real published paper and validated through end-to-end reproduction with verified ground-truth results and detailed scoring rubrics. Using an agentified assessment pipeline, we evaluate a set of coding agents on PRBench and analyze their capabilities across key dimensions of scientific reasoning and execution. The best-performing agent, OpenAI Codex powered by GPT-5.3-Codex, achieves a mean overall score of 34%. All agents exhibit a zero end-to-end callback success rate, with particularly poor performance in data accuracy and code correctness. We further identify systematic failure modes, including errors in formula implementation, inability to debug numerical simulations, and fabrication of output data. Overall, PRBench provides a rigorous benchmark for evaluating progress toward autonomous scientific research.
SP, Apr 29
Hybrid Digital and Microwave Linear Analog Computer (MiLAC)-aided Beamforming for Multiuser MIMO-OFDM Systems
Yiyang Peng, Zheyu Wu, Bruno Clerckx
Microwave linear analog computing (MiLAC) has recently emerged as a promising architecture for analog-domain beamforming. In particular, a hybrid digital-MiLAC architecture was proposed and shown to achieve fully-digital beamforming flexibility in narrowband systems when the number of radio-frequency (RF) chains equals the number of data streams. However, its performance in wideband systems remains unexplored. This paper presents the first study of hybrid digital-MiLAC beamforming for wideband multi-user multiple-input single-output (MU-MISO) systems. We first characterize the minimum number of RF chains required for hybrid digital-MiLAC beamforming to realize an arbitrary set of fully-digital beamforming matrices across all subcarriers. It turns out that, unlike in the narrowband case, a larger number of RF chains is generally required in frequency-selective channels to achieve fully-digital beamforming flexibility, which may be unfavorable in practice. To study the performance of hybrid digital-MiLAC beamforming with a limited number of RF chains, we then formulate the average sum-rate maximization problem and develop an efficient weighted minimum mean-square error (WMMSE)-based algorithm for beamforming design. Simulation results show that hybrid digital-MiLAC beamforming consistently outperforms conventional hybrid digital-analog beamforming, and achieves $89.93\%$ of the fully-digital sum-rate while using only $12.5\%$ of the RF chains in highly frequency-selective channels.
LG, Nov 11, 2025
From Sequential to Recursive: Enhancing Decision-Focused Learning with Bidirectional Feedback
Xinyu Wang, Jinxiao Du, Yiyang Peng et al.
Decision-focused learning (DFL) has emerged as a powerful end-to-end alternative to conventional predict-then-optimize (PTO) pipelines by directly optimizing predictive models through downstream decision losses. Existing DFL frameworks are limited by their strictly sequential structure, referred to as sequential DFL (S-DFL), which fails to capture the bidirectional feedback between prediction and optimization in complex interaction scenarios. In view of this, we propose, for the first time, recursive decision-focused learning (R-DFL), a novel framework that introduces bidirectional feedback between downstream optimization and upstream prediction. We further extend two distinct differentiation methods, explicit unrolling via automatic differentiation and implicit differentiation based on fixed-point methods, to facilitate efficient gradient propagation in R-DFL. We rigorously prove that both methods achieve comparable gradient accuracy, with the implicit method offering superior computational efficiency. Extensive experiments on both synthetic and real-world datasets, including the newsvendor problem and the bipartite matching problem, demonstrate that R-DFL not only substantially enhances the final decision quality over sequential baselines but also exhibits robust adaptability across diverse scenarios in closed-loop decision-making problems.
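The contrast between explicit unrolling and implicit fixed-point differentiation mentioned in the abstract above can be illustrated on a toy problem. The sketch below is not the R-DFL method: it assumes a hypothetical scalar contraction map x = 0.5*cos(x) + theta standing in for the prediction-optimization loop, and it differentiates the unrolled iterations by hand in forward mode rather than with an automatic differentiation library. All function names are illustrative.

```python
import math


def f(x, theta):
    """Hypothetical contraction map; |df/dx| <= 0.5, so iteration converges."""
    return 0.5 * math.cos(x) + theta


def unrolled_gradient(theta, steps=50):
    """Differentiate through every iteration (forward-mode chain rule)."""
    x, dx = 0.0, 0.0  # dx tracks d x_k / d theta
    for _ in range(steps):
        # d/dtheta of f(x_k, theta): (df/dx) * dx_k + df/dtheta
        dx = -0.5 * math.sin(x) * dx + 1.0
        x = f(x, theta)
    return x, dx


def implicit_gradient(theta, steps=50):
    """Iterate to the fixed point, then apply the implicit function theorem."""
    x = 0.0
    for _ in range(steps):
        x = f(x, theta)
    # dx*/dtheta = (df/dtheta) / (1 - df/dx), evaluated at the fixed point
    return x, 1.0 / (1.0 + 0.5 * math.sin(x))


if __name__ == "__main__":
    print(unrolled_gradient(0.3))
    print(implicit_gradient(0.3))
```

In this toy setting both routes give the same gradient up to numerical tolerance. The implicit route needs only the converged point rather than the derivative state of every iteration, which is the kind of saving the abstract attributes to the implicit method.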
LG, Nov 27, 2024
SPO-VCS: An End-to-End Smart Predict-then-Optimize Framework with Alternating Differentiation Method for Relocation Problems in Large-Scale Vehicle Crowd Sensing
Xinyu Wang, Yiyang Peng, Wei Ma
Ubiquitous mobile devices have catalyzed the development of vehicle crowd sensing (VCS). In particular, vehicle sensing systems show great potential in the flexible acquisition of spatio-temporal urban data through built-in sensors under diverse sensing scenarios. However, vehicle systems often exhibit biased coverage due to the heterogeneous nature of trip requests and routes. To achieve high sensing coverage, a critical challenge lies in optimally relocating vehicles to minimize the divergence between vehicle distributions and target sensing distributions. Conventional approaches typically employ a two-stage predict-then-optimize (PTO) process: first predicting real-time vehicle distributions and subsequently generating an optimal relocation strategy based on the predictions. However, this approach can lead to suboptimal decision-making due to the propagation of errors from upstream prediction. To this end, we develop an end-to-end Smart Predict-then-Optimize (SPO) framework that integrates optimization into prediction within the deep learning architecture; the entire framework is trained by minimizing the task-specific matching divergence rather than the upstream prediction error. Methodologically, we formulate the vehicle relocation problem as a quadratic program (QP) and incorporate a novel unrolling approach based on the Alternating Direction Method of Multipliers (ADMM) within the SPO framework to compute gradients of the QP layer, facilitating backpropagation and gradient-based optimization for end-to-end learning. The effectiveness of the proposed framework is validated on real-world taxi datasets in Hong Kong. Utilizing the alternating differentiation method, the general SPO framework offers a novel approach to addressing decision-making problems under uncertainty, demonstrating significant potential for advancing applications in intelligent transportation systems.
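To make the ADMM-based QP layer in the abstract above more concrete, here is a minimal sketch of the forward ADMM iterations for a box-constrained QP, i.e. the kind of fixed sequence of updates that an unrolling approach would differentiate through. It is not the SPO-VCS formulation: the 2-variable problem, the box constraints, the penalty rho, and every function name are assumptions chosen for illustration, and no gradient computation is shown.

```python
def solve_2x2(a, b):
    """Solve a @ x = b for a 2x2 matrix a using the closed-form inverse."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (a[1][1] * b[0] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def admm_box_qp(P, q, lo, hi, rho=1.0, iters=200):
    """Minimize 0.5*x'Px + q'x subject to lo <= x <= hi (2 variables).

    Splitting x = z with scaled dual u:
      x-step: solve (P + rho*I) x = rho*(z - u) - q
      z-step: project x + u onto the box
      u-step: accumulate the residual x - z
    """
    z, u = [0.0, 0.0], [0.0, 0.0]
    A = [[P[0][0] + rho, P[0][1]], [P[1][0], P[1][1] + rho]]
    for _ in range(iters):
        rhs = [rho * (z[i] - u[i]) - q[i] for i in range(2)]
        x = solve_2x2(A, rhs)
        z = [min(max(x[i] + u[i], lo), hi) for i in range(2)]
        u = [u[i] + x[i] - z[i] for i in range(2)]
    return z  # z is always feasible for the box


if __name__ == "__main__":
    P = [[2.0, 0.5], [0.5, 1.0]]
    q = [-1.0, -1.0]
    print(admm_box_qp(P, q, lo=0.0, hi=0.5))
```

For this illustrative instance the second variable is held at its upper bound and the first settles at an interior value, which can be checked by hand from the KKT conditions. Because each step is a linear solve, a clip, and an addition, the whole loop is a composition of simple operations, which is what makes unrolling it inside a learning pipeline feasible.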