Shunpu Tang

LG
h-index 68
10 papers
213 citations
Novelty 52%
AI Score 54

10 Papers

CR Apr 23
Privacy-Preserving Semantic Communication over Wiretap Channels with Learnable Differential Privacy

Weixuan Chen, Qianqian Yang, Shuo Shao et al.

While semantic communication (SemCom) improves transmission efficiency by focusing on task-relevant information, it also raises critical privacy concerns. Many existing secure SemCom approaches rely on restrictive or impractical assumptions, such as favorable channel conditions for the legitimate user or prior knowledge of the eavesdropper's model. To address these limitations, this paper proposes a novel secure SemCom framework for image transmission over wiretap channels, leveraging differential privacy (DP) to provide approximate privacy guarantees. Specifically, our approach first extracts disentangled semantic representations from source images using a generative adversarial network (GAN) inversion method, and then selectively perturbs private semantic representations with approximate DP noise. Distinct from conventional DP-based protection methods, we introduce DP noise with a learnable pattern, instead of traditional white Gaussian or Laplace noise, achieved through adversarial training of neural networks (NNs). This design mitigates the inherent non-invertibility of DP while effectively protecting private information. Moreover, it enables explicitly controllable security levels by adjusting the privacy budget according to specific security requirements, which is not achieved in most existing secure SemCom approaches. Experimental results demonstrate that, compared with the previous DP-based method and direct transmission, the proposed method significantly degrades the reconstruction quality for the eavesdropper, while introducing only slight degradation in task performance. Under comparable security levels, our approach achieves an LPIPS advantage of 0.06-0.29 and an FPPSR advantage of 0.10-0.86 for the legitimate user compared with the previous DP-based method.
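For orientation, the conventional baseline that the abstract contrasts against (white Gaussian noise applied only to private semantic dimensions) can be sketched as follows. This is not the paper's learnable-noise method. The function names, the latent layout, the choice of private indices, and the unit sensitivity are all assumptions made for illustration; the noise scale uses the classic Gaussian-mechanism calibration, which holds for a privacy budget between 0 and 1.

```python
import math
import numpy as np

def gaussian_sigma(epsilon, delta, sensitivity):
    # Classic Gaussian-mechanism calibration (valid for 0 < epsilon < 1).
    # A smaller privacy budget epsilon yields a larger noise scale.
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon

def perturb_private(latent, private_idx, epsilon, delta, sensitivity=1.0, seed=0):
    # Add white Gaussian noise only to the dimensions marked as private;
    # all other semantic dimensions are passed through unchanged.
    rng = np.random.default_rng(seed)
    sigma = gaussian_sigma(epsilon, delta, sensitivity)
    noisy = latent.astype(float).copy()
    noisy[private_idx] += rng.normal(0.0, sigma, size=len(private_idx))
    return noisy

# Hypothetical 8-dimensional latent in which dimensions 2 and 5 are private.
latent = np.zeros(8)
private_idx = [2, 5]
noisy = perturb_private(latent, private_idx, epsilon=0.5, delta=1e-5)
```

Lowering `epsilon` increases the noise scale, which is the sense in which the privacy budget controls the security level.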

IT Mar 18
Cache-enabled Generative Joint Source-Channel Coding for Evolving Semantic Communications

Shunpu Tang, Qianqian Yang, Jihong Park et al.

Learning-based semantic communication (SemCom) has recently emerged as a promising paradigm for improving the transmission efficiency of wireless networks. However, existing methods typically rely on extensive end-to-end training, which is both inflexible and computationally expensive in dynamic wireless environments. Moreover, they fail to exploit redundancy across multiple transmissions of semantically similar content, limiting overall efficiency. To overcome these limitations, we propose a channel-aware generative adversarial network (GAN) inversion-based joint source-channel coding (CAGI-JSCC) framework that enables training-free SemCom by leveraging a pre-trained SemanticStyleGAN model. By explicitly incorporating wireless channel characteristics into the GAN inversion process, CAGI-JSCC adapts to varying channel conditions without additional training. Furthermore, we introduce a cache-enabled dynamic codebook (CDC) that caches disentangled semantic components at both the transmitter and receiver, allowing the system to reuse previously transmitted content. This semantic-level caching can continuously reduce redundant transmissions as experience accumulates. Extensive experiments on image transmission demonstrate the effectiveness of the proposed framework. In particular, our system achieves comparable perceptual quality with an average bandwidth compression ratio (BCR) of 1/224, and as low as 1/1024 for a single image, significantly outperforming baselines with a BCR of 1/128.
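The caching idea described above (both ends store previously transmitted semantic components so that a repeat can be sent as a short index) can be illustrated with a minimal sketch. The class name `SemanticCache`, the Euclidean distance test, and the threshold value are hypothetical and are not taken from the paper; the sketch also ignores channel noise and cache eviction.

```python
import numpy as np

class SemanticCache:
    """Toy synchronized codebook held at both transmitter and receiver."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.entries = []

    def encode(self, component):
        # Transmitter side: reuse a cached entry if one is close enough,
        # otherwise send the full vector and remember it.
        for i, cached in enumerate(self.entries):
            if np.linalg.norm(cached - component) <= self.threshold:
                return ("index", i)
        self.entries.append(component.copy())
        return ("vector", component)

    def decode(self, message):
        # Receiver side: resolve an index from the local cache, or store
        # a newly received vector so both caches stay aligned.
        kind, payload = message
        if kind == "index":
            return self.entries[payload]
        self.entries.append(payload.copy())
        return payload

tx, rx = SemanticCache(threshold=0.1), SemanticCache(threshold=0.1)
first = np.array([1.0, 2.0])
second = first + 0.01            # semantically similar content
m1 = tx.encode(first)
r1 = rx.decode(m1)
m2 = tx.encode(second)
r2 = rx.decode(m2)
```

In this toy run the first component is sent in full, while the second is close enough to the cached entry that only an index is transmitted.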

LG May 9
Generative Actor-Critic with Soft Bridge Policies

Ke He, Le He, Shunpu Tang et al.

Expressive generative policies such as diffusion and flow models are appealing for MaxEnt online reinforcement learning because of their ability to model multimodal and highly non-Gaussian action distributions. However, training effective soft generative policies faces two obstacles that often arise together. First, marginal action densities are often unavailable, so existing methods typically rely on entropy bounds, heuristic proxies or approximations. Second, iterative shared-parameter samplers raise inference cost and require backpropagation through time over repeated network evaluations, increasing memory cost and destabilizing policy optimization. These obstacles motivate us to seek a generative policy that exposes a tractable MaxEnt objective while requiring only a single sampled actor forward pass for action generation. To this end, we propose soft generative actor-critic (SoftGAC), whose actor defines a stochastic bridge from a fixed base latent to a terminal action latent in pre-tanh space. This structured bridge allows us to lift the MaxEnt objective as an analytically tractable path-wise relative-entropy objective against a high-entropy reference process. In practical finite-step implementation, this relative entropy reduces exactly to sampled transition control energy and thus provides principled soft regularization. Moreover, we keep the single-pass actor lightweight by using small step-specific bridge transitions, each evaluated only once per sampled action, while maintaining a parameter budget comparable to strong actor baselines. Extensive experiments on challenging continuous-control benchmarks show that SoftGAC attains higher or competitive returns than strong generative policy baselines, including diffusion and flow-matching policies, while staying in the low-latency regime of one-pass actors and showing considerable improvements in the compute-return tradeoff.

MA May 7
AgenticPrecoding: LLM-Empowered Multi-Agent System for Precoding Optimization

Zijiu Yang, Zixiang Zhang, Shunpu Tang et al.

Precoding is a key technique for interference management and performance improvement in multi-antenna wireless systems. However, existing precoding methods are typically developed for specific system models, objectives, and constraint sets, which limits their adaptability to the heterogeneous and evolving scenarios expected in future 6G networks. To address this limitation, we propose AgenticPrecoding, a universal multi-agent framework that automates end-to-end precoding derivation directly from user-level communication requirements. Specifically, AgenticPrecoding decomposes the derivation process into four coordinated stages: problem formulation, solver selection, prompt upsampling, and code generation, assigning each stage to a specialized agent tailored to its specific reasoning demands. We employ two LoRA-adapted reasoning agents to inject precoding-specific domain knowledge for problem formulation and solver selection, while two general-purpose Large Language Models (LLMs) handle prompt refinement and executable code generation. Furthermore, a feedback-driven refinement mechanism is incorporated to enhance code executability, constraint feasibility, and solution quality. Extensive experiments across 10 representative precoding scenarios demonstrate that AgenticPrecoding achieves superior cross-scenario adaptability compared to conventional optimization-based and LLM-based baselines.

AI May 4
CoVSpec: Efficient Device-Edge Co-Inference for Vision-Language Models via Speculative Decoding

Yuanyuan Jia, Shunpu Tang, Qianqian Yang

Vision-language models (VLMs) have demonstrated strong capabilities in multimodal perception and reasoning. However, deploying large VLMs on mobile devices remains challenging due to their substantial computational and memory demands. A practical alternative is device-edge co-inference, where a lightweight draft VLM on the mobile device collaborates with a larger target VLM on the edge server via speculative decoding. Nevertheless, directly extending speculative decoding to VLMs suffers from severe inefficiency due to excessive visual-token computation and high communication overhead. To address these challenges, we propose CoVSpec, an efficient collaborative speculative decoding framework for VLM inference. Specifically, we first develop a training-free visual token reduction framework that prunes redundant visual tokens on the mobile device by jointly considering query relevance, token activity, and low-rank dependency. Moreover, we design an adaptive drafting strategy that dynamically adjusts both the verification frequency and the draft length. In addition, we introduce a parallel branching mechanism with decoupled verification-correction to improve draft-side utilization during target-side verification and reduce correction-related transmission overhead. Experiments on multiple benchmarks show that CoVSpec achieves up to 2.21x higher throughput than target-only inference and reduces communication overhead by more than 96% compared with baselines, without compromising task accuracy.
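The draft-then-verify loop that speculative decoding builds on can be sketched with toy models. This is the generic greedy-matching variant, not CoVSpec itself: it omits visual token pruning, adaptive draft length, and parallel branching. The function `speculative_step` and both toy next-token functions are hypothetical. A real target model would verify all drafted positions in one batched forward pass rather than in a Python loop.

```python
def speculative_step(prefix, draft_next, target_next, draft_len):
    # 1) The lightweight draft model proposes draft_len tokens autoregressively.
    drafted = []
    ctx = list(prefix)
    for _ in range(draft_len):
        tok = draft_next(ctx)
        drafted.append(tok)
        ctx.append(tok)

    # 2) The target model checks each drafted token against its own greedy choice.
    accepted = []
    ctx = list(prefix)
    for tok in drafted:
        expected = target_next(ctx)
        if tok != expected:
            # First mismatch: emit the target's correction and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3) Every draft matched, so the target contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted

# Toy models: the target counts upward modulo 10; the draft errs after a 3.
target_next = lambda ctx: (ctx[-1] + 1) % 10
draft_next = lambda ctx: 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10

tokens = speculative_step([0], draft_next, target_next, draft_len=4)
print(tokens)  # [1, 2, 3, 4]
```

Three drafted tokens are accepted and the fourth is replaced by the target's correction, so one verification round yields four tokens.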

LG Apr 29, 2024
FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition

Yuxuan Yan, Qianqian Yang, Shunpu Tang et al.

Despite their exceptional performance on various tasks after fine-tuning, pre-trained language models (PLMs) face significant challenges due to growing privacy concerns with data in centralized training methods. We consider federated learning (FL) to fine-tune PLMs in this paper. However, the substantial number of parameters in PLMs poses significant difficulties for client devices with limited communication and computational resources. One promising solution is to integrate parameter-efficient fine-tuning (PEFT) into FL, which trains a much smaller set of parameters than full parameter fine-tuning (FFT). Although they remarkably improve training efficiency, PEFT methods may lead to degraded performance, especially when data across different clients are non-i.i.d., as revealed by experimental results. To overcome this, we propose FeDeRA, which extends and improves a widely used PEFT method, i.e., low-rank adaptation (LoRA). FeDeRA follows LoRA by decomposing the weight matrices of the PLMs into low-rank matrices, which allows for more efficient computation and parameter updates during fine-tuning. Unlike LoRA, which simply initializes these low-rank matrices by random sampling or zeros, the proposed FeDeRA initializes them with the results of singular value decomposition (SVD) on the pre-trained weight matrices. Extensive experiments across various tasks and datasets show that FeDeRA outperforms the considered PEFT baselines and is comparable to or even surpasses the FFT method within the FL setting in terms of task performance. Moreover, FeDeRA requires only 1% of the trainable parameters of FFT, reducing the training time needed to reach the same task performance level by more than 90%. The experimental results also highlight the robustness of FeDeRA against data heterogeneity, as it maintains stable task performance even as data heterogeneity increases.
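A minimal NumPy sketch of SVD-based initialization of low-rank adapter matrices follows. The abstract states only that the matrices are initialized from an SVD of the pre-trained weights; splitting the singular values evenly between the two factors and keeping a frozen residual are assumptions made here, and `svd_lora_init` is a hypothetical name.

```python
import numpy as np

def svd_lora_init(weight, rank):
    # Truncated SVD of the pre-trained weight matrix.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    sqrt_s = np.sqrt(s[:rank])
    # Assumption: share the singular values evenly between the two factors.
    b = u[:, :rank] * sqrt_s            # shape (out_features, rank)
    a = sqrt_s[:, None] * vt[:rank]     # shape (rank, in_features)
    # Assumption: freeze the remainder so residual + b @ a equals the
    # original weight at initialization.
    residual = weight - b @ a
    return a, b, residual

rng = np.random.default_rng(0)
weight = rng.normal(size=(6, 4))        # stand-in for a pre-trained matrix
a, b, residual = svd_lora_init(weight, rank=2)
```

With this layout only `a` and `b` would be trained and communicated, while `residual` stays fixed on each client.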

CV Apr 18
Generative Semantic Communication via Alternating Dual-Domain Posterior Sampling

Shunpu Tang, Qianqian Yang

Generative semantic communication (SemCom) harnesses pretrained generative priors to improve the perceptual quality of wireless image transmission. Existing generative SemCom receivers, however, rely on maximum a posteriori (MAP) estimation, which fundamentally cannot preserve the data distribution and thus limits achievable perceptual quality. Moreover, current diffusion-based approaches using single-domain guidance face significant limitations: latent-domain guidance is sensitive to channel noise, while image-domain guidance inherits decoder bias. Simply combining both domains simultaneously yields an overconfident pseudo-posterior. In this paper, we formulate semantic decoding as a Bayesian inverse problem and prove that posterior sampling achieves optimal perceptual quality by preserving the data distribution. Building on this insight, we propose alternating dual-domain posterior sampling (ADDPS), a diffusion-based SemCom receiver that alternately enforces latent-domain and image-domain consistency during the sampling process. This alternating strategy decomposes joint posterior sampling into simpler subproblems, avoiding gradient conflicts while retaining the complementary strengths of both domains. Experiments on FFHQ demonstrate that the proposed ADDPS achieves superior perceptual quality compared with existing methods.

NI Feb 24, 2025
Toward Agentic AI: Generative Information Retrieval Inspired Intelligent Communications and Networking

Ruichen Zhang, Shunpu Tang, Yinqiu Liu et al.

The increasing complexity and scale of modern telecommunications networks demand intelligent automation to enhance efficiency, adaptability, and resilience. Agentic AI has emerged as a key paradigm for intelligent communications and networking, enabling AI-driven agents to perceive, reason, decide, and act within dynamic networking environments. However, effective decision-making in telecom applications, such as network planning, management, and resource allocation, requires integrating retrieval mechanisms that support multi-hop reasoning, historical cross-referencing, and compliance with evolving 3GPP standards. This article presents a forward-looking perspective on generative information retrieval-inspired intelligent communications and networking, emphasizing the role of knowledge acquisition, processing, and retrieval in agentic AI for telecom systems. We first provide a comprehensive review of generative information retrieval strategies, including traditional retrieval, hybrid retrieval, semantic retrieval, knowledge-based retrieval, and agentic contextual retrieval. We then analyze their advantages, limitations, and suitability for various networking scenarios. Next, we present a survey about their applications in communications and networking. Additionally, we introduce an agentic contextual retrieval framework to enhance telecom-specific planning by integrating multi-source retrieval, structured reasoning, and self-reflective validation. Experimental results demonstrate that our framework significantly improves answer accuracy, explanation consistency, and retrieval efficiency compared to traditional and semantic retrieval methods. Finally, we outline future research directions.

CV Oct 18, 2025
DiffusionX: Efficient Edge-Cloud Collaborative Image Generation with Multi-Round Prompt Evolution

Yi Wei, Shunpu Tang, Liang Zhao et al.

Recent advances in diffusion models have driven remarkable progress in image generation. However, the generation process remains computationally intensive, and users often need to iteratively refine prompts to achieve the desired results, further increasing latency and placing a heavy burden on cloud resources. To address this challenge, we propose DiffusionX, a cloud-edge collaborative framework for efficient multi-round, prompt-based generation. In this system, a lightweight on-device diffusion model interacts with users by rapidly producing preview images, while a high-capacity cloud model performs final refinements after the prompt is finalized. We further introduce a noise level predictor that dynamically balances the computation load, optimizing the trade-off between latency and cloud workload. Experiments show that DiffusionX reduces average generation time by 15.8% compared with Stable Diffusion v1.5, while maintaining comparable image quality. Moreover, it is only 0.9% slower than Tiny-SD with significantly improved image quality, thereby demonstrating efficiency and scalability with minimal overhead.

LG Oct 28, 2021
Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT

Shunpu Tang, Lunyuan Chen, Ke He, Junjuan Xia et al.

In this paper, we investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks. In this system, the IoT devices can collaboratively train a shared model without compromising data privacy. However, due to limited resources in the industrial IoT networks, including computational power, bandwidth, and channel state, it is challenging for many devices to accomplish local training and upload weights to the edge server in time. To address this issue, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework, where the deep model can be divided into several sub-models with different depths, each of which outputs predictions from its own exit. In this way, the devices with insufficient computational power can choose the earlier exits and avoid training the complete model, which can help reduce computational latency and enable as many devices as possible to participate in aggregation within a latency threshold. Moreover, we propose a greedy approach-based exit selection and bandwidth allocation algorithm to maximize the total number of exits in each communication round. Simulation experiments are conducted on the classical Fashion-MNIST dataset under a non-independent and identically distributed (non-IID) setting, and the results show that the proposed strategy outperforms conventional FL. In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
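The per-device part of exit selection (pick the deepest exit that still meets the latency threshold) can be sketched as below. This is a simplification under stated assumptions: the paper's algorithm also allocates bandwidth jointly across devices, whereas here the upload time of each exit is treated as a fixed input. The function name and all timing values are hypothetical.

```python
def choose_exit(cum_compute_time, upload_time, threshold):
    """Return the index of the deepest feasible exit, or None if none fits.

    cum_compute_time[k]: time to train the sub-model ending at exit k.
    upload_time[k]: time to upload that sub-model's weights (assumed fixed).
    """
    best = None
    for k, (compute, upload) in enumerate(zip(cum_compute_time, upload_time)):
        if compute + upload <= threshold:
            best = k   # deeper exits overwrite shallower feasible ones
    return best

# Hypothetical device: three exits, latency threshold of 3.5 time units.
exit_index = choose_exit([1.0, 2.5, 4.0], [0.5, 0.8, 1.2], threshold=3.5)
print(exit_index)  # 1
```

A device that cannot meet the threshold even at the first exit gets `None` and would sit out that communication round.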