Jiadong Yu

LG
h-index8
8papers
119citations
Novelty52%
AI Score52

8 Papers

HCDec 23, 2025
Integrating Brain-Computer Interface and Neuromorphic Computing for Human Digital Twins

Chen Shang, Jiadong Yu, Dinh Thai Hoang

The integration of immersive communication into a human-centric ecosystem has intensified the demand for sophisticated Human Digital Twins (HDTs) driven by multifaceted human data. However, the effective construction of HDTs faces significant challenges due to the heterogeneity of data collection devices, the high energy demands associated with processing intricate data, and concerns over the privacy of sensitive information. This work introduces a novel biologically-inspired (bio-inspired) HDT framework that leverages Brain-Computer Interface (BCI) sensor technology to capture brain signals as the data source for constructing HDT. By collecting and analyzing these signals, the framework not only minimizes device heterogeneity and enhances data collection efficiency, but also provides richer and more nuanced physiological and psychological data for constructing personalized HDTs. To this end, we further propose a bio-inspired neuromorphic computing learning model based on the Spiking Neural Network (SNN). This model utilizes discrete neural spikes to emulate the way of human brain processes information, thereby enhancing the system's ability to process data effectively while reducing energy consumption. Additionally, we integrate a Federated Learning (FL) strategy within the model to strengthen data privacy. We then conduct a case study to demonstrate the performance of our proposed twofold bio-inspired scheme. Finally, we present several challenges and promising directions for future research of HDTs driven by bio-inspired technologies.

LGFeb 13
Amortized Reasoning Tree Search: Decoupling Proposal and Decision in Large Language Models

Zesheng Hong, Jiadong Yu, Hui Pan

Reinforcement Learning with Verifiable Rewards (RLVR) has established itself as the dominant paradigm for instilling rigorous reasoning capabilities in Large Language Models. While effective at amplifying dominant behaviors, we identify a critical pathology in this alignment process: the systematic suppression of valid but rare (low-likelihood under the base model distribution) reasoning paths. We theoretically characterize this phenomenon as a "Normalization Squeeze," where the interplay between mode-seeking policy gradients and finite sampling acts as a high-pass likelihood filter, driving the probability of rare correct traces to statistical extinction. To counteract this collapse without discarding the base model's latent diversity, we propose Amortized Reasoning Tree Search (ARTS). Unlike standard approaches that force internalization via parameter updates, ARTS prioritizes deliberation by decoupling generation from verification. We introduce a Flow Matching objective that repurposes the verifier to estimate the conservation of probability flow, enabling robust navigation through sparse, high-entropy search spaces where traditional discriminative objectives fail. Extensive experiments on the MATH-500 benchmark demonstrate that ARTS achieves a performance of 74.6% (BoN@16), effectively matching fully fine-tuned policies (74.7%) without modifying the generative backbone. Crucially, on the long-tail subset where coupled RL optimization collapses to 0% pass@k, ARTS uniquely recovers significant performance, suggesting that disentangling verification from generation offers a more robust pathway for solving complex reasoning tasks.

9.8LGMar 24
Spiking Personalized Federated Learning for Brain-Computer Interface-Enabled Immersive Communication

Chen Shang, Dinh Thai Hoang, Diep N. Nguyen et al.

This work proposes a novel immersive communication framework that leverages brain-computer interface (BCI) to acquire brain signals for inferring user-centric states (e.g., intention and perception-related discomfort), thereby enabling more personalized and robust immersive adaptation under strong individual variability. Specifically, we develop a personalized federated learning (PFL) model to analyze and process the collected brain signals, which not only accommodates neurodiverse brain-signal data but also prevents the leakage of sensitive brain-signal information. To address the energy bottleneck of continual on-device learning and inference on energy-limited immersive terminals (e.g., head-mounted display), we further embed spiking neural networks (SNNs) into the PFL. By exploiting sparse, event-driven spike computation, the SNN-enabled PFL reduces the computation and energy cost of training and inference while maintaining competitive personalization performance. Experiments on real brain-signal dataset demonstrate that our method achieves the best overall identification accuracy while reducing inference energy by 6.46$\times$ compared with conventional artificial neural network-based personalized baselines.

LGFeb 11, 2025
Improve the Training Efficiency of DRL for Wireless Communication Resource Allocation: The Role of Generative Diffusion Models

Xinren Zhang, Jiadong Yu

Dynamic resource allocation in mobile wireless networks involves complex, time-varying optimization problems, motivating the adoption of deep reinforcement learning (DRL). However, most existing works rely on pre-trained policies, overlooking dynamic environmental changes that rapidly invalidate the policies. Periodic retraining becomes inevitable but incurs prohibitive computational costs and energy consumption-critical concerns for resource-constrained wireless systems. We identify three root causes of inefficient retraining: high-dimensional state spaces, suboptimal action spaces exploration-exploitation trade-offs, and reward design limitations. To overcome these limitations, we propose Diffusion-based Deep Reinforcement Learning (D2RL), which leverages generative diffusion models (GDMs) to holistically enhance all three DRL components. Iterative refinement process and distribution modelling of GDMs enable (1) the generation of diverse state samples to improve environmental understanding, (2) balanced action space exploration to escape local optima, and (3) the design of discriminative reward functions that better evaluate action quality. Our framework operates in two modes: Mode I leverages GDMs to explore reward spaces and design discriminative reward functions that rigorously evaluate action quality, while Mode II synthesizes diverse state samples to enhance environmental understanding and generalization. Extensive experiments demonstrate that D2RL achieves faster convergence and reduced computational costs over conventional DRL methods for resource allocation in wireless communications while maintaining competitive policy performance. This work underscores the transformative potential of GDMs in overcoming fundamental DRL training bottlenecks for wireless networks, paving the way for practical, real-time deployments.

AIFeb 15, 2024
Federated Prompt-based Decision Transformer for Customized VR Services in Mobile Edge Computing System

Tailin Zhou, Jiadong Yu, Jun Zhang et al.

This paper investigates resource allocation to provide heterogeneous users with customized virtual reality (VR) services in a mobile edge computing (MEC) system. We first introduce a quality of experience (QoE) metric to measure user experience, which considers the MEC system's latency, user attention levels, and preferred resolutions. Then, a QoE maximization problem is formulated for resource allocation to ensure the highest possible user experience,which is cast as a reinforcement learning problem, aiming to learn a generalized policy applicable across diverse user environments for all MEC servers. To learn the generalized policy, we propose a framework that employs federated learning (FL) and prompt-based sequence modeling to pre-train a common decision model across MEC servers, which is named FedPromptDT. Using FL solves the problem of insufficient local MEC data while protecting user privacy during offline training. The design of prompts integrating user-environment cues and user-preferred allocation improves the model's adaptability to various user environments during online execution.

SPAug 27, 2025
Energy-Efficient Learning-Based Beamforming for ISAC-Enabled V2X Networks

Chen Shang, Jiadong Yu, Dinh Thai Hoang

This work proposes an energy-efficient, learning-based beamforming scheme for integrated sensing and communication (ISAC)-enabled V2X networks. Specifically, we first model the dynamic and uncertain nature of V2X environments as a Markov Decision Process. This formulation allows the roadside unit to generate beamforming decisions based solely on current sensing information, thereby eliminating the need for frequent pilot transmissions and extensive channel state information acquisition. We then develop a deep reinforcement learning (DRL) algorithm to jointly optimize beamforming and power allocation, ensuring both communication throughput and sensing accuracy in highly dynamic scenario. To address the high energy demands of conventional learning-based schemes, we embed spiking neural networks (SNNs) into the DRL framework. Leveraging their event-driven and sparsely activated architecture, SNNs significantly enhance energy efficiency while maintaining robust performance. Simulation results confirm that the proposed method achieves substantial energy savings and superior communication performance, demonstrating its potential to support green and sustainable connectivity in future V2X systems.

LGJun 24, 2025
Causal-Aware Intelligent QoE Optimization for VR Interaction with Adaptive Keyframe Extraction

Ziru Zhang, Jiadong Yu, Danny H. K. Tsang

The optimization of quality of experience (QoE) in multi-user virtual reality (VR) interactions demands a delicate balance between ultra-low latency, high-fidelity motion synchronization, and equitable resource allocation. While adaptive keyframe extraction mitigates transmission overhead, existing approaches often overlook the causal relationships among allocated bandwidth, CPU frequency, and user perception, limiting QoE gains. This paper proposes an intelligent framework to maximize QoE by integrating adaptive keyframe extraction with causal-aware reinforcement learning (RL). First, a novel QoE metric is formulated using the Weber-Fechner Law, combining perceptual sensitivity, attention-driven priorities, and motion reconstruction accuracy. The QoE optimization problem is then modeled as a mixed integer programming (MIP) task, jointly optimizing keyframe ratios, bandwidth, and computational resources under horizon-fairness constraints. We propose Partial State Causal Deep Deterministic Policy Gradient (PS-CDDPG), which integrates the Deep Deterministic Policy Gradient (DDPG) method with causal influence detection. By leveraging causal information regarding how QoE is influenced and determined by various actions, we explore actions guided by weights calculated from causal inference (CI), which in turn improves training efficiency. Experiments conducted with the CMU Motion Capture Database demonstrate that our framework significantly reduces interactive latency, enhances QoE, and maintains fairness, achieving superior performance compared to benchmark methods.

SPApr 5, 2020
Multi-agent Reinforcement Learning for Resource Allocation in IoT networks with Edge Computing

Xiaolan Liu, Jiadong Yu, Yue Gao

To support popular Internet of Things (IoT) applications such as virtual reality, mobile games and wearable devices, edge computing provides a front-end distributed computing archetype of centralized cloud computing with low latency. However, it's challenging for end users to offload computation due to their massive requirements on spectrum and computation resources and frequent requests on Radio Access Technology (RAT). In this paper, we investigate computation offloading mechanism with resource allocation in IoT edge computing networks by formulating it as a stochastic game. Here, each end user is a learning agent observing its local environment to learn optimal decisions on either local computing or edge computing with the goal of minimizing long term system cost by choosing its transmit power level, RAT and sub-channel without knowing any information of the other end users. Therefore, a multi-agent reinforcement learning framework is developed to solve the stochastic game with a proposed independent learners based multi-agent Q-learning (IL-based MA-Q) algorithm. Simulations demonstrate that the proposed IL-based MA-Q algorithm is feasible to solve the formulated problem and is more energy efficient without extra cost on channel estimation at the centralized gateway compared to the other two benchmark algorithms.