LGDec 31, 2025Code
MSACL: Multi-Step Actor-Critic Learning with Lyapunov Certificates for Exponentially Stabilizing ControlYongwei Zhang, Yuanzhe Xing, Quanyi Liang et al.
For safety-critical applications, model-free reinforcement learning (RL) faces numerous challenges, particularly the difficulty of establishing verifiable stability guarantees while maintaining high exploration efficiency. To address these challenges, we present Multi-Step Actor-Critic Learning with Lyapunov Certificates (MSACL), a novel approach that seamlessly integrates exponential stability with maximum entropy reinforcement learning (MERL). In contrast to existing methods that rely on complex reward engineering and single-step constraints, MSACL utilizes intuitive rewards and multi-step data for actor-critic learning. Specifically, we first introduce Exponential Stability Labels (ESLs) to categorize samples and propose a $λ$-weighted aggregation mechanism to learn Lyapunov certificates. Leveraging these certificates, we then develop a stability-aware advantage function to guide policy optimization, thereby ensuring rapid Lyapunov descent and robust state convergence. We evaluate MSACL across six benchmarks, comprising four stabilization and two high-dimensional tracking tasks. Experimental results demonstrate its consistent superiority over both standard RL baselines and state-of-the-art Lyapunov-based RL algorithms. Beyond rapid convergence, MSACL exhibits significant robustness against environmental uncertainties and remarkable generalization to unseen reference signals. The source code and benchmarking environments are available at \href{https://github.com/YuanZhe-Xing/MSACL}{https://github.com/YuanZhe-Xing/MSACL}.
IRMar 6, 2025
In-depth Analysis of Graph-based RAG in a Unified FrameworkYingli Zhou, Yaodong Su, Youran Sun et al.
Graph-based Retrieval-Augmented Generation (RAG) has proven effective in integrating external knowledge into large language models (LLMs), improving their factual accuracy, adaptability, interpretability, and trustworthiness. A number of graph-based RAG methods have been proposed in the literature. However, these methods have not been systematically and comprehensively compared under the same experimental settings. In this paper, we first summarize a unified framework to incorporate all graph-based RAG methods from a high-level perspective. We then extensively compare representative graph-based RAG methods over a range of questing-answering (QA) datasets -- from specific questions to abstract questions -- and examine the effectiveness of all methods, providing a thorough analysis of graph-based RAG approaches. As a byproduct of our experimental analysis, we are also able to identify new variants of the graph-based RAG methods over specific QA and abstract QA tasks respectively, by combining existing techniques, which outperform the state-of-the-art methods. Finally, based on these findings, we offer promising research opportunities. We believe that a deeper understanding of the behavior of existing methods can provide new valuable insights for future research.
SPAug 23, 2025
Deep Learning-based Techniques for Integrated Sensing and Communication Systems: State-of-the-Art, Challenges, and OpportunitiesMurat Temiz, Yongwei Zhang, Yanwei Fu et al.
This article comprehensively reviews recent developments and research on deep learning-based (DL-based) techniques for integrated sensing and communication (ISAC) systems. ISAC, which combines sensing and communication functionalities, is regarded as a key enabler for 6G and beyond networks, as many emerging applications, such as vehicular networks and industrial robotics, necessitate both sensing and communication capabilities for effective operation. A unified platform that provides both functions can reduce hardware complexity, alleviate frequency spectrum congestion, and improve energy efficiency. However, integrating these functionalities on the same hardware requires highly optimized signal processing and system design, introducing significant computational complexity when relying on conventional iterative or optimization-based techniques. As an alternative to conventional techniques, DL-based techniques offer efficient and near-optimal solutions with reduced computational complexity. Hence, such techniques are well-suited for operating under limited computational resources and low latency requirements in real-time systems. DL-based techniques can swiftly and effectively yield near-optimal solutions for a wide range of sophisticated ISAC-related tasks, including waveform design, channel estimation, sensing signal processing, data demodulation, and interference mitigation. Therefore, motivated by these advantages, recent studies have proposed various DL-based approaches for ISAC system design. After briefly introducing DL architectures and ISAC fundamentals, this survey presents a comprehensive and categorized review of state-of-the-art DL-based techniques for ISAC, highlights their key advantages and major challenges, and outlines potential directions for future research.
RONov 26, 2024
Self-reconfiguration Strategies for Space-distributed SpacecraftTianle Liu, Zhixiang Wang, Yongwei Zhang et al.
This paper proposes a distributed on-orbit spacecraft assembly algorithm, where future spacecraft can assemble modules with different functions on orbit to form a spacecraft structure with specific functions. This form of spacecraft organization has the advantages of reconfigurability, fast mission response and easy maintenance. Reasonable and efficient on-orbit self-reconfiguration algorithms play a crucial role in realizing the benefits of distributed spacecraft. This paper adopts the framework of imitation learning combined with reinforcement learning for strategy learning of module handling order. A robot arm motion algorithm is then designed to execute the handling sequence. We achieve the self-reconfiguration handling task by creating a map on the surface of the module, completing the path point planning of the robotic arm using A*. The joint planning of the robotic arm is then accomplished through forward and reverse kinematics. Finally, the results are presented in Unity3D.