AISep 5, 2025
Multi-Modal Multi-Task (M3T) Federated Foundation Models for Embodied AI: Potentials and Challenges for Edge IntegrationKasra Borazjani, Payam Abdisarabshali, Fardis Nadimi et al.
As embodied AI systems become increasingly multi-modal, personalized, and interactive, they must learn effectively from diverse sensory inputs, adapt continually to user preferences, and operate safely under resource and privacy constraints. These challenges expose a pressing need for machine learning models capable of swift, context-aware adaptation while balancing model generalization and personalization. Here, two methods emerge as suitable candidates, each offering parts of these capabilities: multi-modal multi-task foundation models (M3T-FMs) provide a pathway toward generalization across tasks and modalities, whereas federated learning (FL) offers the infrastructure for distributed, privacy-preserving model updates and user-level model personalization. However, when used in isolation, each of these approaches falls short of meeting the complex and diverse capability requirements of real-world embodied AI environments. In this vision paper, we introduce multi-modal multi-task federated foundation models (M3T-FFMs) for embodied AI, a new paradigm that unifies the strengths of M3T-FMs with the privacy-preserving distributed training nature of FL, enabling intelligent systems at the wireless edge. We collect critical deployment dimensions of M3T-FFMs in embodied AI ecosystems under a unified framework, which we name "EMBODY": Embodiment heterogeneity, Modality richness and imbalance, Bandwidth and compute constraints, On-device continual learning, Distributed control and autonomy, and Yielding safety, privacy, and personalization. For each, we identify concrete challenges and envision actionable research directions. We also present an evaluation framework for deploying M3T-FFMs in embodied AI systems, along with the associated trade-offs. Finally, we present a prototype implementation of M3T-FFMs and evaluate their energy and latency performance.
LGAug 8, 2022
Learning-Based Client Selection for Federated Learning Services Over Wireless Networks with Constrained Monetary BudgetsZhipeng Cheng, Xuwei Fan, Minghui Liwang et al.
We investigate a data quality-aware dynamic client selection problem for multiple federated learning (FL) services in a wireless network, where each client offers dynamic datasets for the simultaneous training of multiple FL services, and each FL service demander has to pay for the clients under constrained monetary budgets. The problem is formalized as a non-cooperative Markov game over the training rounds. A multi-agent hybrid deep reinforcement learning-based algorithm is proposed to optimize the joint client selection and payment actions, while avoiding action conflicts. Simulation results indicate that our proposed algorithm can significantly improve training performance.
67.4NIApr 15
Look One Step Ahead: Forward-Looking Incentive Design with Strategic Privacy for Proactive Service Provisioning over Air-Ground Integrated Edge NetworksSicheng Wu, Minghui Liwang, Yangyang Gao et al.
In air-ground integrated networks (AGINs), unmanned aerial vehicles (UAVs) provide on-demand edge services to ground vehicles. Realizing this vision requires carefully designed incentives to coordinate interactions among self-interested participants. This is exacerbated by the dynamic nature of AGINs, where spatio-temporal variations introduce significant uncertainty in matching UAVs and vehicles. Existing real-time service provisioning typically relies on precise trajectory information, raising privacy concerns and incurring decision latency. To address these challenges, we propose look one-step ahead (LOSA), a novel framework for efficient and privacy-aware service provisioning. By exploiting predictable vehicle travel times between intersections, LOSA decomposes the process into two coupled phases: (i) a privacy-aware look-ahead phase and (ii) a lightweight real-time execution phase. The look-ahead phase allows vehicles to adaptively adjust privacy budgets based on historical utility, balancing trajectory exposure and matching accuracy. Leveraging this, a double auction mechanism establishes binding one-step-ahead agreements (OSAAs) through trajectory similarity clustering, while constructing preference lists to hedge against mobility uncertainty. The execution phase then enforces pre-established OSAAs and preference lists, resolving real-time resource conflicts without costly re-negotiations. This design reduces computational overhead and preserves robustness. We analytically corroborate that LOSA guarantees truthfulness, individual rationality, and budget balance. Experiments on real-world datasets (DAIR-V2X, HighD, and RCooper) demonstrate that LOSA achieves superior privacy protection while lowering transaction latency compared to baseline approaches.
LGJan 20
ELSA: Efficient LLM-Centric Split Aggregation for Privacy-Aware Hierarchical Federated Learning over Resource-Constrained Edge NetworksXiaohong Yang, Tong Xie, Minghui Liwang et al.
Training large language models (LLMs) at the network edge faces fundamental challenges arising from device resource constraints, severe data heterogeneity, and heightened privacy risks. To address these, we propose ELSA (Efficient LLM-centric Split Aggregation), a novel framework that systematically integrates split learning (SL) and hierarchical federated learning (HFL) for distributed LLM fine-tuning over resource-constrained edge networks. ELSA introduces three key innovations. First, it employs a task-agnostic, behavior-aware client clustering mechanism that constructs semantic fingerprints using public probe inputs and symmetric KL divergence, further enhanced by prediction-consistency-based trust scoring and latency-aware edge assignment to jointly address data heterogeneity, client unreliability, and communication constraints. Second, it splits the LLM into three parts across clients and edge servers, with the cloud used only for adapter aggregation, enabling an effective balance between on-device computation cost and global convergence stability. Third, it incorporates a lightweight communication scheme based on computational sketches combined with semantic subspace orthogonal perturbation (SS-OP) to reduce communication overhead while mitigating privacy leakage during model exchanges. Experiments across diverse NLP tasks demonstrate that ELSA consistently outperforms state-of-the-art methods in terms of adaptability, convergence behavior, and robustness, establishing a scalable and privacy-aware solution for edge-side LLM fine-tuning under resource constraints.
LGJun 4, 2022
Distributed Machine Learning in D2D-Enabled Heterogeneous Networks: Architectures, Performance, and Open ChallengesZhipeng Cheng, Xuwei Fan, Minghui Liwang et al.
The ever-growing concerns regarding data privacy have led to a paradigm shift in machine learning (ML) architectures from centralized to distributed approaches, giving rise to federated learning (FL) and split learning (SL) as the two predominant privacy-preserving ML mechanisms. However,implementing FL or SL in device-to-device (D2D)-enabled heterogeneous networks with diverse clients presents substantial challenges, including architecture scalability and prolonged training delays. To address these challenges, this article introduces two innovative hybrid distributed ML architectures, namely, hybrid split FL (HSFL) and hybrid federated SL (HFSL). Such architectures combine the strengths of both FL and SL in D2D-enabled heterogeneous wireless networks. We provide a comprehensive analysis of the performance and advantages of HSFL and HFSL, while also highlighting open challenges for future exploration. We support our proposals with preliminary simulations using three datasets in non-independent and non-identically distributed settings, demonstrating the feasibility of our architectures. Our simulations reveal notable reductions in communication/computation costs and training delays as compared to conventional FL and SL.
LGSep 3, 2025Code
Hierarchical Federated Foundation Models over Wireless Networks for Multi-Modal Multi-Task Intelligence: Integration of Edge Learning with D2D/P2P-Enabled Fog Learning ArchitecturesPayam Abdisarabshali, Fardis Nadimi, Kasra Borazjani et al.
The rise of foundation models (FMs) has reshaped the landscape of machine learning. As these models continued to grow, leveraging geo-distributed data from wireless devices has become increasingly critical, giving rise to federated foundation models (FFMs). More recently, FMs have evolved into multi-modal multi-task (M3T) FMs (e.g., GPT-4) capable of processing diverse modalities across multiple tasks, which motivates a new underexplored paradigm: M3T FFMs. In this paper, we unveil an unexplored variation of M3T FFMs by proposing hierarchical federated foundation models (HF-FMs), which in turn expose two overlooked heterogeneity dimensions to fog/edge networks that have a direct impact on these emerging models: (i) heterogeneity in collected modalities and (ii) heterogeneity in executed tasks across fog/edge nodes. HF-FMs strategically align the modular structure of M3T FMs, comprising modality encoders, prompts, mixture-of-experts (MoEs), adapters, and task heads, with the hierarchical nature of fog/edge infrastructures. Moreover, HF-FMs enable the optional usage of device-to-device (D2D) communications, enabling horizontal module relaying and localized cooperative training among nodes when feasible. Through delving into the architectural design of HF-FMs, we highlight their unique capabilities along with a series of tailored future research directions. Finally, to demonstrate their potential, we prototype HF-FMs in a wireless network setting and release the open-source code for the development of HF-FMs with the goal of fostering exploration in this untapped field (GitHub: https://github.com/payamsiabd/M3T-FFM).
45.9NIMar 19
Masking Intent, Sustaining Equilibrium: Risk-Aware Potential Game-empowered Two-Stage Mobile CrowdsensingHouyi Qi, Minghui Liwang, Kaiwen Tan et al.
Beyond data collection, future mobile crowdsensing (MCS) in complex applications must satisfy diverse requirements, including reliable task completion, budget and quality constraints, and fluctuating worker availability. Besides raw-data and location privacy, workers' intent/preference traces can be exploited by an honest-but-curious platform, enabling intent inference from repeated observations and frequency profiling. Meanwhile, worker dropouts and execution uncertainty may cause coverage instability and redundant sensing, while repeated global online re-optimization incurs high interaction overhead and enlarges the observable attack surface. To address these issues, we propose iParts, an intent-preserving and risk-controllable two-stage service provisioning framework for dynamic MCS. In the offline stage, workers report perturbed intent vectors via personalized local differential privacy with memorization/permanent randomization, suppressing frequency-based inference while preserving decision utility. Using only perturbed intents, the platform builds a redundancy-aware quality model and performs risk-aware pre-planning under budget, individual rationality, quality-failure risk, and intent-mismatch risk constraints. We formulate offline pre-planning as an exact potential game with expected social welfare as the potential function, ensuring a constrained pure-strategy Nash equilibrium and finite-step convergence under asynchronous feasible improvements. In the online stage, when runtime dynamics cause quality deficits, a temporary-recruitment potential game over idle/standby workers enables lightweight remediation with bounded interaction rounds and low observability. Experiments show that iParts achieves a favorable privacy-utility-efficiency trade-off, improving welfare and task completion while reducing redundancy and communication overhead compared with representative baselines.
77.4NIMay 4
Risk-Budgeted Online Scheduling for Continuous Edge Inference over Evolving Time HorizonsHouyi Qi, Minghui Liwang, Sai Zou et al.
Continuous edge inference necessitates not merely low per-timeslot latency, but sustained timeliness guarantees in the presence of time-varying channels, fluctuating edge workloads, and coupled bandwidth-computing resource constraints. Existing studies predominantly optimize instantaneous delay or per-timeslot utility, while largely overlooking the regulation of cross-time deadline violation dynamics in continuous services. To address this, we propose AEGIS, a prediction-empowered risk-budgeted online scheduling framework for continuous edge inference. AEGIS models deadline-violation tendency as an updatable cross-time control state through dynamic user-level risk budgets, so that online scheduling accounts for both instantaneous efficiency and long-term service stability. To support proactive decision making, AEGIS leverages LSTM-based short-term state prediction to construct a smooth deadline-violation risk surrogate, and formulates the resulting time-wise resource competition as a potential-aligned game under coupled feasibility constraints. An asynchronous online algorithm is then developed with finite-step convergence. Experiments demonstrate that AEGIS improves the timely inference ratio, reduces average violation risk and violation burst length, and achieves a favorable delay--risk--convergence trade-off over representative baselines.
28.8NIMay 4
Forecasting-Driven Stable Successor Matching for UAV-Assisted Continuous Edge ServicesHouyi Qi, Minghui Liwang, Yuhan Su et al.
Continuous and reliable service support is crucial for emerging latency-sensitive and computation-intensive applications in UAV-assisted edge networks (UENs) due to operational dynamics and environmental uncertainty. Although conventional designs can improve coverage and computing efficiency, they often rely on instantaneous resource optimization or reactive handover, rendering ongoing services vulnerable to non-negligible interruptions when the serving UAV degrades due to mobility, energy depletion, or channel dynamics. To avoid such post-failure recovery, a promising approach is to prepare a successor UAV in advance, i.e., a standby UAV that reserves minimal resources and synchronizes service context for possible takeover. Thus, we consider a dynamic UEN architecture where each mobile user carries an ongoing computing mission requiring persistent service support, while UAVs provide wireless access and computing services under time-varying network dynamics and stringent onboard energy constraints. To facilitate proactive and continuous service provisioning, we propose a forecasting-driven proactive reservation-based continuous service scheduling framework, termed Fresco. In Fresco, an LSTM-based module is first used to predict short-term disruption risks of ongoing missions from historical network observations. Guided by these predictions, an online risk-aware successor matching scheme selects suitable standby UAVs for high-risk missions under delay, resource, and energy constraints, while incorporating minimal communication/computation reservation and lightweight service-context synchronization for efficient takeover preparation. Experiments show that Fresco significantly reduces service interruptions and improves mission continuity over reactive and non-predictive baselines, with only modest reservation overhead.
54.4NIMay 4
Zero-Trust Bilateral Edge Service Trading with Deposit-Refund Regulation for Runtime ComplianceHouyi Qi, Minghui Liwang, Zhipeng Cheng et al.
Privacy-sensitive edge services necessitate optimizing diverse-type resource scheduling to support trustworthy provisioning within a zero-trust security framework. However, existing studies rarely model how runtime compliance jointly affects bilateral clearing, ex-post settlement, and future seller eligibility in dynamic edge markets. To address this issue, we propose ZEBRIS, a zero-trust bilateral edge service trading framework with deposit-refund regulation for privacy-sensitive services. Specifically, edge provisioning is modeled as a trading form of zero-trust-compliant service packages, where the buyer-side effective valuation captures service value, delay penalty, and privacy risk, while the seller-side effective ask incorporates resource and compliance costs. This yields a resource-aware positive-margin bilateral clearing mechanism under shared resource and security constraints. To discipline post-clearing moral hazard, we further design a capped deposit-refund settlement rule based on measurable runtime compliance and update each seller's future security posture according to realized compliance outcomes. ZEBRIS satisfies bilateral individual rationality and no-subsidy weak budget balance. Experiments demonstrate that ZEBRIS improves social welfare and compliance robustness while reducing service delay and privacy-risk-weighted cost over representative baselines.
LGFeb 13, 2025
Towards Seamless Hierarchical Federated Learning under Intermittent Client Participation: A Stagewise Decision-Making MethodologyMinghong Wu, Minghui Liwang, Yuhan Su et al.
Federated Learning (FL) offers a pioneering distributed learning paradigm that enables devices/clients to build a shared global model. This global model is obtained through frequent model transmissions between clients and a central server, which may cause high latency, energy consumption, and congestion over backhaul links. To overcome these drawbacks, Hierarchical Federated Learning (HFL) has emerged, which organizes clients into multiple clusters and utilizes edge nodes (e.g., edge servers) for intermediate model aggregations between clients and the central server. Current research on HFL mainly focus on enhancing model accuracy, latency, and energy consumption in scenarios with a stable/fixed set of clients. However, addressing the dynamic availability of clients -- a critical aspect of real-world scenarios -- remains underexplored. This study delves into optimizing client selection and client-to-edge associations in HFL under intermittent client participation so as to minimize overall system costs (i.e., delay and energy), while achieving fast model convergence. We unveil that achieving this goal involves solving a complex NP-hard problem. To tackle this, we propose a stagewise methodology that splits the solution into two stages, referred to as Plan A and Plan B. Plan A focuses on identifying long-term clients with high chance of participation in subsequent model training rounds. Plan B serves as a backup, selecting alternative clients when long-term clients are unavailable during model training rounds. This stagewise methodology offers a fresh perspective on client selection that can enhance both HFL and conventional FL via enabling low-overhead decision-making processes. Through evaluations on MNIST and CIFAR-10 datasets, we show that our methodology outperforms existing benchmarks in terms of model accuracy and system costs.
LGMar 8, 2025
Adaptive UAV-Assisted Hierarchical Federated Learning: Optimizing Energy, Latency, and Resilience for Dynamic Smart IoTXiaohong Yang, Minghui Liwang, Liqun Fu et al.
Hierarchical Federated Learning (HFL) extends conventional Federated Learning (FL) by introducing intermediate aggregation layers, enabling distributed learning in geographically dispersed environments, particularly relevant for smart IoT systems, such as remote monitoring and battlefield operations, where cellular connectivity is limited. In these scenarios, UAVs serve as mobile aggregators, dynamically connecting terrestrial IoT devices. This paper investigates an HFL architecture with energy-constrained, dynamically deployed UAVs prone to communication disruptions. We propose a novel approach to minimize global training costs by formulating a joint optimization problem that integrates learning configuration, bandwidth allocation, and device-to-UAV association, ensuring timely global aggregation before UAV disconnections and redeployments. The problem accounts for dynamic IoT devices and intermittent UAV connectivity and is NP-hard. To tackle this, we decompose it into three subproblems: \textit{(i)} optimizing learning configuration and bandwidth allocation via an augmented Lagrangian to reduce training costs; \textit{(ii)} introducing a device fitness score based on data heterogeneity (via Kullback-Leibler divergence), device-to-UAV proximity, and computational resources, using a TD3-based algorithm for adaptive device-to-UAV assignment; \textit{(iii)} developing a low-complexity two-stage greedy strategy for UAV redeployment and global aggregator selection, ensuring efficient aggregation despite UAV disconnections. Experiments on diverse real-world datasets validate the approach, demonstrating cost reduction and robust performance under communication disruptions.
LGFeb 22, 2025
Privacy-Aware Joint DNN Model Deployment and Partitioning Optimization for Collaborative Edge Inference ServicesZhipeng Cheng, Xiaoyu Xia, Hong Wang et al.
Edge inference (EI) has emerged as a promising paradigm to address the growing limitations of cloud-based Deep Neural Network (DNN) inference services, such as high response latency, limited scalability, and severe data privacy exposure. However, deploying DNN models on resource-constrained edge devices introduces additional challenges, including limited computation/storage resources, dynamic service demands, and heightened privacy risks. To tackle these issues, this paper presents a novel privacy-aware optimization framework that jointly addresses DNN model deployment, user-server association, and model partitioning, with the goal of minimizing long-term average inference delay under resource and privacy constraints. The problem is formulated as a complex, NP-hard stochastic optimization. To efficiently handle system dynamics and computational complexity, we employ a Lyapunov-based approach to transform the long-term objective into tractable per-slot decisions. Furthermore, we introduce a coalition formation game to enable adaptive user-server association and design a greedy algorithm for model deployment within each coalition. Extensive simulations demonstrate that the proposed algorithm significantly reduces inference delay and consistently satisfies privacy constraints, outperforming state-of-the-art baselines across diverse scenarios.
LGJan 17, 2025
HEART: Achieving Timely Multi-Model Training for Vehicle-Edge-Cloud-Integrated Hierarchical Federated LearningXiaohong Yang, Minghui Liwang, Xianbin Wang et al.
The rapid growth of AI-enabled Internet of Vehicles (IoV) calls for efficient machine learning (ML) solutions that can handle high vehicular mobility and decentralized data. This has motivated the emergence of Hierarchical Federated Learning over vehicle-edge-cloud architectures (VEC-HFL). Nevertheless, one aspect which is underexplored in the literature on VEC-HFL is that vehicles often need to execute multiple ML tasks simultaneously, where this multi-model training environment introduces crucial challenges. First, improper aggregation rules can lead to model obsolescence and prolonged training times. Second, vehicular mobility may result in inefficient data utilization by preventing the vehicles from returning their models to the network edge. Third, achieving a balanced resource allocation across diverse tasks becomes of paramount importance as it majorly affects the effectiveness of collaborative training. We take one of the first steps towards addressing these challenges via proposing a framework for multi-model training in dynamic VEC-HFL with the goal of minimizing global training latency while ensuring balanced training across various tasks-a problem that turns out to be NP-hard. To facilitate timely model training, we introduce a hybrid synchronous-asynchronous aggregation rule. Building on this, we present a novel method called Hybrid Evolutionary And gReedy allocaTion (HEART). The framework operates in two stages: first, it achieves balanced task scheduling through a hybrid heuristic approach that combines improved Particle Swarm Optimization (PSO) and Genetic Algorithms (GA); second, it employs a low-complexity greedy algorithm to determine the training priority of assigned tasks on vehicles. Experiments on real-world datasets demonstrate the superiority of HEART over existing methods.