Peihao Yan

NI
h-index4
5papers
11citations
Novelty50%
AI Score54

5 Papers

73.8NIMay 11Code
Demystifying Deep Reinforcement Learning: A Neuro-Symbolic Framework for Interpretable Open RAN Automation

Jie Lu, Peihao Yan, Pang-Ning Tan et al.

Open Radio Access Networks (O-RAN) are increasingly adopting data-driven control through Deep Reinforcement Learning (DRL) to optimize complex tasks such as network slicing and mobility management. However, the deployment of DRL in carrier-grade networks is hindered by its inherent opacity and stochastic execution, which limit operator trust, auditability, and safe deployment. Existing explainable AI (XAI) approaches primarily provide post-hoc insights and fail to produce executable, interpretable policies suitable for operational environments. In this paper, we present DeRAN, a neuro-symbolic framework that bridges the gap between DRL performance and operational transparency by distilling black-box DRL policies into human-readable symbolic representations. DeRAN introduces a concept-driven abstraction layer that transforms high-dimensional network telemetry into a compact set of semantically meaningful features, enabling interpretable policy learning. Building on the semantically grounded concepts, DeRAN synthesizes symbolic policies using deep symbolic regression (DSR) for continuous control and neurally guided differentiable logic (NUDGE) for discrete decision-making. We implement DeRAN on a live 5G O-RAN testbed and evaluate it on two representative use cases. Experimental results demonstrate that DeRAN achieves 78\% and 87\% of DRL's cumulative rewards in the two use cases, while offering interpretability and auditability by design. Source code is available at https://github.com/Jadejavu/A-Neuro-Symbolic-Framework-for-Interpretable-Open-RAN-Automation

SYSep 17, 2025Code
Near-Real-Time Resource Slicing for QoS Optimization in 5G O-RAN using Deep Reinforcement Learning

Peihao Yan, Jie Lu, Huacheng Zeng et al.

Open-Radio Access Network (O-RAN) has become an important paradigm for 5G and beyond radio access networks. This paper presents an xApp called xSlice for the Near-Real-Time (Near-RT) RAN Intelligent Controller (RIC) of 5G O-RANs. xSlice is an online learning algorithm that adaptively adjusts MAC-layer resource allocation in response to dynamic network states, including time-varying wireless channel conditions, user mobility, traffic fluctuations, and changes in user demand. To address these network dynamics, we first formulate the Quality-of-Service (QoS) optimization problem as a regret minimization problem by quantifying the QoS demands of all traffic sessions through weighting their throughput, latency, and reliability. We then develop a deep reinforcement learning (DRL) framework that utilizes an actor-critic model to combine the advantages of both value-based and policy-based updating methods. A graph convolutional network (GCN) is incorporated as a component of the DRL framework for graph embedding of RAN data, enabling xSlice to handle a dynamic number of traffic sessions. We have implemented xSlice on an O-RAN testbed with 10 smartphones and conducted extensive experiments to evaluate its performance in realistic scenarios. Experimental results show that xSlice can reduce performance regret by 67% compared to the state-of-the-art solutions. Source code is available on GitHub [1].

71.8NIApr 27
TARMM: Scaling Delay-Critical Edge AI Offloading in 5G O-RAN via Temporal Graph Mobility Management

Peihao Yan, Yun Chen, Jie Lu et al.

Emerging delay-critical edge AI applications, such as VR perception and real-time video analytics, impose stringent latency and reliability requirements on 5G networks. However, existing mobility management mechanisms are largely reactive and fail to adapt to dynamic network conditions, resulting in suboptimal handover decisions and degraded performance. In this paper, we present TARMM, a 5G Open Radio Access Network (O-RAN) system that optimizes user mobility management for delay-critical edge AI offloading. The core of TARMM is a temporal graph model that captures the spatiotemporal dynamics of the RAN across users and cells, enabling near real-time handover decisions. Building on this representation, we design a multi-agent reinforcement learning (MARL) framework with rule-based action masking and proactive resource preparation to ensure safe, stable, and efficient handovers. We implement TARMM on a multi-cell indoor 5G O-RAN testbed and evaluate it using diverse VR workloads. Extensive experiments show that TARMM reduces tail latency by up to 44% and packet loss by up to 56% compared to state-of-the-art approaches.

90.9NIMar 12
RadEar: A Self-Supervised RF Backscatter System for Voice Eavesdropping and Separation

Qijun Wang, Peihao Yan, Chunqi Qian et al.

Eavesdropping on voice conversations presents a growing threat to personal privacy and information security. In this paper, we present RadEar, a novel RF backscatter-based system designed to enable covert voice eavesdropping through walls. RadEar consists of two key components: (i) a batteryless RF backscatter tag covertly deployed inside the target space, and (ii) an RF reader located outside the room that performs signal demodulation, voice separation, and denoising. The tag features a compact, dual-resonator design that achieves energy-efficient frequency modulation for continuous voice eavesdropping while mitigating self-interference by separating excitation and reflection frequencies. To overcome the challenges of weak signal reception and overlapping speech, the RF reader employs self-supervised learning models for voice separation and denoising, trained using a remix-based objective without requiring ground-truth labels. We fabricate and evaluate RadEar in real-world scenarios, demonstrating its ability to recover and separate human speech with high fidelity under practical constraints.

SYFeb 9
EExApp: GNN-Based Reinforcement Learning for Radio Unit Energy Optimization in 5G O-RAN

Jie Lu, Peihao Yan, Huacheng Zeng

With over 3.5 million 5G base stations deployed globally, their collective energy consumption (projected to exceed 131 TWh annually) raises significant concerns over both operational costs and environmental impacts. In this paper, we present EExAPP, a deep reinforcement learning (DRL)-based xApp for 5G Open Radio Access Network (O-RAN) that jointly optimizes radio unit (RU) sleep scheduling and distributed unit (DU) resource slicing. EExAPP uses a dual-actor-dual-critic Proximal Policy Optimization (PPO) architecture, with dedicated actor-critic pairs targeting energy efficiency and quality-of-service (QoS) compliance. A transformer-based encoder enables scalable handling of variable user equipment (UE) populations by encoding all-UE observations into fixed-dimensional representations. To coordinate the two optimization objectives, a bipartite Graph Attention Network (GAT) is used to modulate actor updates based on both critic outputs, enabling adaptive tradeoffs between power savings and QoS. We have implemented EExAPP and deployed it on a real-world 5G O-RAN testbed with live traffic, commercial RU and smartphones. Extensive over-the-air experiments and ablation studies confirm that EExAPP significantly outperforms existing methods in reducing the energy consumption of RU while maintaining QoS.