DC May 17
TSFLora: Token-Compressed Split Fine-Tuning for Wireless Edge Networks
Xianke Qiang, Zheng Chang, Li Wang et al.
Adapting large AI models (LAMs) to personalized edge data is challenging because wireless devices have limited memory, computation, and uplink capacity. Federated fine-tuning preserves data privacy but still requires each device to host the full model, while split learning reduces device memory at the cost of heavy activation transmission. This paper proposes TSFLora, a token-compressed split fine-tuning framework for communication-efficient LAM adaptation at the edge. TSFLora combines attention-guided token selection, token merging, low-bit activation quantization, and LoRA-based adaptation within a split federated training pipeline. The key idea is to compress the intermediate token sequence before transmission so that the system reduces both uplink traffic and server-side processing without changing the frozen backbone. Experiments on ViT models over CIFAR-10, CIFAR-100, and TinyImageNet show that TSFLora achieves up to 6.8× communication reduction and 41% memory saving while maintaining competitive accuracy.
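As a rough illustration of the compression idea in this abstract (select tokens, merge the rest, quantize before uplink), the following is a minimal sketch and not the paper's algorithm. It assumes per-token importance scores are already available (for example, attention weights from a class token), merges all unselected tokens into a single mean token, and applies uniform min-max quantization; the function names `select_and_merge`, `quantize`, and `dequantize` are hypothetical.

```python
from typing import List, Tuple

Token = List[float]


def select_and_merge(tokens: List[Token], scores: List[float], keep: int) -> List[Token]:
    """Keep the `keep` highest-scoring tokens and merge the rest into one mean token.

    Assumption: `scores` are importance values supplied by the caller
    (e.g. attention weights); the single-mean-token merge is illustrative.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    if not 0 < keep <= len(tokens):
        raise ValueError("keep must be between 1 and len(tokens)")
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept_idx = sorted(ranked[:keep])      # preserve original token order
    dropped_idx = ranked[keep:]
    out = [list(tokens[i]) for i in kept_idx]
    if dropped_idx:
        dim = len(tokens[0])
        merged = [sum(tokens[i][d] for i in dropped_idx) / len(dropped_idx)
                  for d in range(dim)]
        out.append(merged)                # one extra token summarizes the rest
    return out


def quantize(tokens: List[Token], bits: int) -> Tuple[List[List[int]], float, float]:
    """Uniform min-max quantization of all values to `bits`-bit integer codes."""
    flat = [v for t in tokens for v in t]
    lo, hi = min(flat), max(flat)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [[round((v - lo) / scale) for v in t] for t in tokens]
    return codes, lo, scale


def dequantize(codes: List[List[int]], lo: float, scale: float) -> List[Token]:
    """Server-side reconstruction of approximate activations from integer codes."""
    return [[lo + c * scale for c in t] for t in codes]


if __name__ == "__main__":
    toks = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
    attn = [0.1, 0.9, 0.2, 0.8]
    compressed = select_and_merge(toks, attn, keep=2)
    codes, lo, scale = quantize(compressed, bits=4)
    print(len(toks), "->", len(compressed), "tokens;", codes)
```

In this sketch the device would transmit only the integer codes plus `lo` and `scale`, so uplink size shrinks with both the number of kept tokens and the bit width.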
DC Apr 21
Semantic-aware Token Selection and Resource Optimization for Communication-efficient Split Federated Fine-tuning in Edge Intelligence
Xianke Qiang, Zheng Chang, Geyong Min
Deploying large Transformer-based vision models on resource-limited mobile devices at the network edge is severely constrained by hardware limitations and dynamic wireless environments. While federated learning (FL) enables collaborative training without sharing raw data, strictly local fine-tuning of such massive models remains computationally prohibitive for edge devices. Split federated learning (SFL) alleviates this burden by offloading deep layers to the edge server, yet it suffers from heavy communication overhead when transmitting high-dimensional activation tokens. To address this bottleneck, we propose ST-SFLora, a semantic token-based split federated LoRA fine-tuning framework. We introduce a new metric, Semantic Transmission Efficiency (STE), to balance semantic retention and transmission cost. Based on STE, we formulate a joint resource optimization problem that dynamically determines token selection, uplink bandwidth allocation, and transmit power under latency and energy constraints. The resulting mixed-integer nonconvex problem is efficiently solved via an alternating algorithm. Experiments on multiple benchmarks demonstrate that ST-SFLora achieves the lowest client-side resource consumption among baselines while delivering a favorable trade-off between communication efficiency and model performance.
LG Apr 12, 2025
Deploying Large AI Models on Resource-Limited Devices with Split Federated Learning
Xianke Qiang, Hongda Liu, Xinran Zhang et al.
Large Artificial Intelligence Models (LAMs), powered by massive datasets, extensive parameter scales, and extensive computational resources, are driving significant transformations across various industries. Yet their practical deployment on resource-limited mobile edge devices is hindered by critical challenges such as data privacy, constrained resources, and high overhead costs. To address this gap, this paper proposes a novel framework named Quantized Split Federated Fine-Tuning Large AI Model (SFLAM). By partitioning the training load between edge devices and servers using a split learning paradigm, SFLAM facilitates the operation of large models on devices and significantly lowers the memory requirements on edge devices. Additionally, SFLAM incorporates quantization management, power control, and bandwidth allocation strategies to enhance training efficiency while reducing energy consumption and communication latency. A theoretical analysis of the latency-energy trade-off is presented, and the framework's efficacy is validated via comprehensive simulations. The findings indicate that SFLAM achieves superior learning efficiency and scalability compared to conventional methods, thereby providing a valuable approach for enabling advanced AI services in resource-constrained scenarios.
LG Mar 26, 2025
AIGC-assisted Federated Learning for Edge Intelligence: Architecture Design, Research Challenges and Future Directions
Xianke Qiang, Zheng Chang, Ying-Chang Liang
Federated learning (FL) can fully leverage large-scale terminal data while ensuring privacy and security, and is considered a distributed alternative to centralized machine learning. However, data heterogeneity limits FL's performance. To address this challenge, artificial intelligence-generated content (AIGC), an innovative data synthesis technique, emerges as one potential solution. In this article, we first provide an overview of the system architecture, performance metrics, and challenges associated with AIGC-assisted FL system design. We then propose the Generative federated learning (GenFL) architecture and present its workflow, including the design of aggregation and weight policy. Finally, using the CIFAR10 and CIFAR100 datasets, we employ diffusion models to generate data and improve FL performance. Experiments conducted under various non-independent and identically distributed (non-IID) data distributions demonstrate the effectiveness of GenFL in overcoming the bottlenecks in FL caused by data heterogeneity. Open research directions for AIGC-assisted FL are also discussed.