Mohsen Guizani

CR
h-index116
122papers
9,272citations
Novelty41%
AI Score55

122 Papers

HCJul 28, 2023
Beyond Reality: The Pivotal Role of Generative AI in the Metaverse

Vinay Chamola, Gaurang Bansal, Tridib Kumar Das et al.

Imagine stepping into a virtual world that's as rich, dynamic, and interactive as our physical one. This is the promise of the Metaverse, and it's being brought to life by the transformative power of Generative Artificial Intelligence (AI). This paper offers a comprehensive exploration of how generative AI technologies are shaping the Metaverse, transforming it into a dynamic, immersive, and interactive virtual world. We delve into the applications of text generation models like ChatGPT and GPT-3, which are enhancing conversational interfaces with AI-generated characters. We explore the role of image generation models such as DALL-E and MidJourney in creating visually stunning and diverse content. We also examine the potential of 3D model generation technologies like Point-E and Lumirithmic in creating realistic virtual objects that enrich the Metaverse experience. But the journey doesn't stop there. We also address the challenges and ethical considerations of implementing these technologies in the Metaverse, offering insights into the balance between user control and AI automation. This paper is not just a study, but a guide to the future of the Metaverse, offering readers a roadmap to harnessing the power of generative AI in creating immersive virtual worlds.

LGOct 27, 2023
From Generative AI to Generative Internet of Things: Fundamentals, Framework, and Outlooks

Jinbo Wen, Jiangtian Nie, Jiawen Kang et al.

Generative Artificial Intelligence (GAI) possesses the capabilities of generating realistic data and facilitating advanced decision-making. By integrating GAI into modern Internet of Things (IoT), Generative Internet of Things (GIoT) is emerging and holds immense potential to revolutionize various aspects of society, enabling more efficient and intelligent IoT applications, such as smart surveillance and voice assistants. In this article, we present the concept of GIoT and conduct an exploration of its potential prospects. Specifically, we first overview four GAI techniques and investigate promising GIoT applications. Then, we elaborate on the main challenges in enabling GIoT and propose a general GAI-based secure incentive mechanism framework to address them, in which we adopt Generative Diffusion Models (GDMs) for incentive mechanism designs and apply blockchain technologies for secure GIoT management. Moreover, we conduct a case study on modern Internet of Vehicle traffic monitoring, which utilizes GDMs to generate effective contracts for incentivizing users to contribute sensing data with high quality. Finally, we suggest several open directions worth investigating for the future popularity of GIoT.

LGOct 24, 2022Code
Energy Pricing in P2P Energy Systems Using Reinforcement Learning

Nicolas Avila, Shahad Hardan, Elnura Zhalieva et al.

The increase in renewable energy on the consumer side gives place to new dynamics in the energy grids. Participants in a microgrid can produce energy and trade it with their peers (peer-to-peer) with the permission of the energy provider. In such a scenario, the stochastic nature of distributed renewable energy generators and energy consumption increases the complexity of defining fair prices for buying and selling energy. In this study, we introduce a reinforcement learning framework to help solve this issue by training an agent to set the prices that maximize the profit of all components in the microgrid, aiming to facilitate the implementation of P2P grids in real-life scenarios. The microgrid considers consumers, prosumers, the service provider, and a community battery. Experimental results on the \textit{Pymgrid} dataset show a successful approach to price optimization for all components in the microgrid. The proposed framework ensures flexibility to account for the interest of these components, as well as the ratio of consumers and prosumers in the microgrid. The results also examine the effect of changing the capacity of the community battery on the profit of the system. The implementation code is available \href{https://github.com/Artifitialleap-MBZUAI/rl-p2p-price-prediction}{here}.

LGApr 18, 2022
A Practical Cross-Device Federated Learning Framework over 5G Networks

Wenti Yang, Naiyu Wang, Zhitao Guan et al.

The concept of federated learning (FL) was first proposed by Google in 2016. Thereafter, FL has been widely studied for the feasibility of application in various fields due to its potential to make full use of data without compromising the privacy. However, limited by the capacity of wireless data transmission, the employment of federated learning on mobile devices has been making slow progress in practical. The development and commercialization of the 5th generation (5G) mobile networks has shed some light on this. In this paper, we analyze the challenges of existing federated learning schemes for mobile devices and propose a novel cross-device federated learning framework, which utilizes the anonymous communication technology and ring signature to protect the privacy of participants while reducing the computation overhead of mobile devices participating in FL. In addition, our scheme implements a contribution-based incentive mechanism to encourage mobile users to participate in FL. We also give a case study of autonomous driving. Finally, we present the performance evaluation of the proposed scheme and discuss some open issues in federated learning.

LGAug 27, 2022
RL-DistPrivacy: Privacy-Aware Distributed Deep Inference for low latency IoT systems

Emna Baccour, Aiman Erbad, Amr Mohamed et al.

Although Deep Neural Networks (DNN) have become the backbone technology of several ubiquitous applications, their deployment in resource-constrained machines, e.g., Internet of Things (IoT) devices, is still challenging. To satisfy the resource requirements of such a paradigm, collaborative deep inference with IoT synergy was introduced. However, the distribution of DNN networks suffers from severe data leakage. Various threats have been presented, including black-box attacks, where malicious participants can recover arbitrary inputs fed into their devices. Although many countermeasures were designed to achieve privacy-preserving DNN, most of them result in additional computation and lower accuracy. In this paper, we present an approach that targets the security of collaborative deep inference via re-thinking the distribution strategy, without sacrificing the model performance. Particularly, we examine different DNN partitions that make the model susceptible to black-box threats and we derive the amount of data that should be allocated per device to hide proprieties of the original input. We formulate this methodology, as an optimization, where we establish a trade-off between the latency of co-inference and the privacy-level of data. Next, to relax the optimal solution, we shape our approach as a Reinforcement Learning (RL) design that supports heterogeneous devices as well as multiple DNNs/datasets.

PFJan 18, 2018
LCD: Low Latency Command Dissemination for A Platoon of Vehicles

Kai Li, Wei Ni, Eduardo Tovar et al.

In a vehicular platoon, a lead vehicle that is responsible for managing the platoon's moving directions and velocity periodically disseminates control commands to following vehicles based on vehicle-to-vehicle communications. However, reducing command dissemination latency with multiple vehicles while ensuring successful message delivery to the tail vehicle is challenging. We propose a new linear dynamic programming algorithm using backward induction and interchange arguments to minimize the dissemination latency of the vehicles. Furthermore, a closed form of dissemination latency in vehicular platoon is obtained by utilizing Markov chain with M/M/1 queuing model. Simulation results confirm that the proposed dynamic programming algorithm improves the dissemination rate by at least 50.9%, compared to similar algorithms in the literature. Moreover, it also approximates the best performance with the maximum gap of up to 0.2 second in terms of latency.

HCJul 13, 2023
Towards Ubiquitous Semantic Metaverse: Challenges, Approaches, and Opportunities

Kai Li, Billy Pik Lik Lau, Xin Yuan et al.

In recent years, ubiquitous semantic Metaverse has been studied to revolutionize immersive cyber-virtual experiences for augmented reality (AR) and virtual reality (VR) users, which leverages advanced semantic understanding and representation to enable seamless, context-aware interactions within mixed-reality environments. This survey focuses on the intelligence and spatio-temporal characteristics of four fundamental system components in ubiquitous semantic Metaverse, i.e., artificial intelligence (AI), spatio-temporal data representation (STDR), semantic Internet of Things (SIoT), and semantic-enhanced digital twin (SDT). We thoroughly survey the representative techniques of the four fundamental system components that enable intelligent, personalized, and context-aware interactions with typical use cases of the ubiquitous semantic Metaverse, such as remote education, work and collaboration, entertainment and socialization, healthcare, and e-commerce marketing. Furthermore, we outline the opportunities for constructing the future ubiquitous semantic Metaverse, including scalability and interoperability, privacy and security, performance measurement and standardization, as well as ethical considerations and responsible AI. Addressing those challenges is important for creating a robust, secure, and ethically sound system environment that offers engaging immersive experiences for the users and AR/VR applications.

ROJun 15, 2023
Motion Comfort Optimization for Autonomous Vehicles: Concepts, Methods, and Techniques

Mohammed Aledhari, Mohamed Rahouti, Junaid Qadir et al.

This article outlines the architecture of autonomous driving and related complementary frameworks from the perspective of human comfort. The technical elements for measuring Autonomous Vehicle (AV) user comfort and psychoanalysis are listed here. At the same time, this article introduces the technology related to the structure of automatic driving and the reaction time of automatic driving. We also discuss the technical details related to the automatic driving comfort system, the response time of the AV driver, the comfort level of the AV, motion sickness, and related optimization technologies. The function of the sensor is affected by various factors. Since the sensor of automatic driving mainly senses the environment around a vehicle, including "the weather" which introduces the challenges and limitations of second-hand sensors in autonomous vehicles under different weather conditions. The comfort and safety of autonomous driving are also factors that affect the development of autonomous driving technologies. This article further analyzes the impact of autonomous driving on the user's physical and psychological states and how the comfort factors of autonomous vehicles affect the automotive market. Also, part of our focus is on the benefits and shortcomings of autonomous driving. The goal is to present an exhaustive overview of the most relevant technical matters to help researchers and application developers comprehend the different comfort factors and systems of autonomous driving. Finally, we provide detailed automated driving comfort use cases to illustrate the comfort-related issues of autonomous driving. Then, we provide implications and insights for the future of autonomous driving.

CYApr 18, 2023
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions

Hani Sami, Ahmad Hammoud, Mouhamad Arafeh et al.

The Metaverse offers a second world beyond reality, where boundaries are non-existent, and possibilities are endless through engagement and immersive experiences using the virtual reality (VR) technology. Many disciplines can benefit from the advancement of the Metaverse when accurately developed, including the fields of technology, gaming, education, art, and culture. Nevertheless, developing the Metaverse environment to its full potential is an ambiguous task that needs proper guidance and directions. Existing surveys on the Metaverse focus only on a specific aspect and discipline of the Metaverse and lack a holistic view of the entire process. To this end, a more holistic, multi-disciplinary, in-depth, and academic and industry-oriented review is required to provide a thorough study of the Metaverse development pipeline. To address these issues, we present in this survey a novel multi-layered pipeline ecosystem composed of (1) the Metaverse computing, networking, communications and hardware infrastructure, (2) environment digitization, and (3) user interactions. For every layer, we discuss the components that detail the steps of its development. Also, for each of these components, we examine the impact of a set of enabling technologies and empowering domains (e.g., Artificial Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on its advancement. In addition, we explain the importance of these technologies to support decentralization, interoperability, user experiences, interactions, and monetization. Our presented study highlights the existing challenges for each component, followed by research directions and potential solutions. To the best of our knowledge, this survey is the most comprehensive and allows users, scholars, and entrepreneurs to get an in-depth understanding of the Metaverse ecosystem to find their opportunities and potentials for contribution.

AINov 5, 2022
ON-DEMAND-FL: A Dynamic and Efficient Multi-Criteria Federated Learning Client Deployment Scheme

Mario Chahoud, Hani Sami, Azzam Mourad et al.

In this paper, we increase the availability and integration of devices in the learning process to enhance the convergence of federated learning (FL) models. To address the issue of having all the data in one location, federated learning, which maintains the ability to learn over decentralized data sets, combines privacy and technology. Until the model converges, the server combines the updated weights obtained from each dataset over a number of rounds. The majority of the literature suggested client selection techniques to accelerate convergence and boost accuracy. However, none of the existing proposals have focused on the flexibility to deploy and select clients as needed, wherever and whenever that may be. Due to the extremely dynamic surroundings, some devices are actually not available to serve as clients in FL, which affects the availability of data for learning and the applicability of the existing solution for client selection. In this paper, we address the aforementioned limitations by introducing an On-Demand-FL, a client deployment approach for FL, offering more volume and heterogeneity of data in the learning process. We make use of the containerization technology such as Docker to build efficient environments using IoT and mobile devices serving as volunteers. Furthermore, Kubernetes is used for orchestration. The Genetic algorithm (GA) is used to solve the multi-objective optimization problem due to its evolutionary strategy. The performed experiments using the Mobile Data Challenge (MDC) dataset and the Localfed framework illustrate the relevance of the proposed approach and the efficiency of the on-the-fly deployment of clients whenever and wherever needed with less discarded rounds and more available data.

LGOct 31, 2022
FedMint: Intelligent Bilateral Client Selection in Federated Learning with Newcomer IoT Devices

Osama Wehbi, Sarhad Arisdakessian, Omar Abdel Wahab et al.

Federated Learning (FL) is a novel distributed privacy-preserving learning paradigm, which enables the collaboration among several participants (e.g., Internet of Things devices) for the training of machine learning models. However, selecting the participants that would contribute to this collaborative training is highly challenging. Adopting a random selection strategy would entail substantial problems due to the heterogeneity in terms of data quality, and computational and communication resources across the participants. Although several approaches have been proposed in the literature to overcome the problem of random selection, most of these approaches follow a unilateral selection strategy. In fact, they base their selection strategy on only the federated server's side, while overlooking the interests of the client devices in the process. To overcome this problem, we present in this paper FedMint, an intelligent client selection approach for federated learning on IoT devices using game theory and bootstrapping mechanism. Our solution involves the design of: (1) preference functions for the client IoT devices and federated servers to allow them to rank each other according to several factors such as accuracy and price, (2) intelligent matching algorithms that take into account the preferences of both parties in their design, and (3) bootstrapping technique that capitalizes on the collaboration of multiple federated servers in order to assign initial accuracy value for the newly connected IoT devices. Based on our simulation findings, our strategy surpasses the VanillaFL selection approach in terms of maximizing both the revenues of the client devices and accuracy of the global federated learning model.

LGFeb 28, 2023
Semi-decentralized Inference in Heterogeneous Graph Neural Networks for Traffic Demand Forecasting: An Edge-Computing Approach

Mahmoud Nazzal, Abdallah Khreishah, Joyoung Lee et al.

Prediction of taxi service demand and supply is essential for improving customer's experience and provider's profit. Recently, graph neural networks (GNNs) have been shown promising for this application. This approach models city regions as nodes in a transportation graph and their relations as edges. GNNs utilize local node features and the graph structure in the prediction. However, more efficient forecasting can still be achieved by following two main routes; enlarging the scale of the transportation graph, and simultaneously exploiting different types of nodes and edges in the graphs. However, both approaches are challenged by the scalability of GNNs. An immediate remedy to the scalability challenge is to decentralize the GNN operation. However, this creates excessive node-to-node communication. In this paper, we first characterize the excessive communication needs for the decentralized GNN approach. Then, we propose a semi-decentralized approach utilizing multiple cloudlets, moderately sized storage and computation devices, that can be integrated with the cellular base stations. This approach minimizes inter-cloudlet communication thereby alleviating the communication overhead of the decentralized approach while promoting scalability due to cloudlet-level decentralization. Also, we propose a heterogeneous GNN-LSTM algorithm for improved taxi-level demand and supply forecasting for handling dynamic taxi graphs where nodes are taxis. Extensive experiments over real data show the advantage of the semi-decentralized approach as tested over our heterogeneous GNN-LSTM algorithm. Also, the proposed semi-decentralized GNN approach is shown to reduce the overall inference time by about an order of magnitude compared to centralized and decentralized inference schemes.

LGMar 5, 2023
Sparsity-Aware Intelligent Massive Random Access Control in Open RAN: A Reinforcement Learning Based Approach

Xiao Tang, Sicong Liu, Xiaojiang Du et al.

Massive random access of devices in the emerging Open Radio Access Network (O-RAN) brings great challenge to the access control and management. Exploiting the bursting nature of the access requests, sparse active user detection (SAUD) is an efficient enabler towards efficient access management, but the sparsity might be deteriorated in case of uncoordinated massive access requests. To dynamically preserve the sparsity of access requests, a reinforcement-learning (RL)-assisted scheme of closed-loop access control utilizing the access class barring technique is proposed, where the RL policy is determined through continuous interaction between the RL agent, i.e., a next generation node base (gNB), and the environment. The proposed scheme can be implemented by the near-real-time RAN intelligent controller (near-RT RIC) in O-RAN, supporting rapid switching between heterogeneous vertical applications, such as mMTC and uRLLC services. Moreover, a data-driven scheme of deep-RL-assisted SAUD is proposed to resolve highly complex environments with continuous and high-dimensional state and action spaces, where a replay buffer is applied for automatic large-scale data collection. An actor-critic framework is formulated to incorporate the strategy-learning modules into the near-RT RIC. Simulation results show that the proposed schemes can achieve superior performance in both access efficiency and user detection accuracy over the benchmark scheme for different heterogeneous services with massive access requests.

DCAug 7, 2024
A Blockchain-based Reliable Federated Meta-learning for Metaverse: A Dual Game Framework

Emna Baccour, Aiman Erbad, Amr Mohamed et al.

The metaverse, envisioned as the next digital frontier for avatar-based virtual interaction, involves high-performance models. In this dynamic environment, users' tasks frequently shift, requiring fast model personalization despite limited data. This evolution consumes extensive resources and requires vast data volumes. To address this, meta-learning emerges as an invaluable tool for metaverse users, with federated meta-learning (FML), offering even more tailored solutions owing to its adaptive capabilities. However, the metaverse is characterized by users heterogeneity with diverse data structures, varied tasks, and uneven sample sizes, potentially undermining global training outcomes due to statistical difference. Given this, an urgent need arises for smart coalition formation that accounts for these disparities. This paper introduces a dual game-theoretic framework for metaverse services involving meta-learners as workers to manage FML. A blockchain-based cooperative coalition formation game is crafted, grounded on a reputation metric, user similarity, and incentives. We also introduce a novel reputation system based on users' historical contributions and potential contributions to present tasks, leveraging correlations between past and new tasks. Finally, a Stackelberg game-based incentive mechanism is presented to attract reliable workers to participate in meta-learning, minimizing users' energy costs, increasing payoffs, boosting FML efficacy, and improving metaverse utility. Results show that our dual game framework outperforms best-effort, random, and non-uniform clustering schemes - improving training performance by up to 10%, cutting completion times by as much as 30%, enhancing metaverse utility by more than 25%, and offering up to 5% boost in training efficiency over non-blockchain systems, effectively countering misbehaving users.

CRFeb 20, 2023
An Incremental Gray-box Physical Adversarial Attack on Neural Network Training

Rabiah Al-qudah, Moayad Aloqaily, Bassem Ouni et al.

Neural networks have demonstrated remarkable success in learning and solving complex tasks in a variety of fields. Nevertheless, the rise of those networks in modern computing has been accompanied by concerns regarding their vulnerability to adversarial attacks. In this work, we propose a novel gradient-free, gray box, incremental attack that targets the training process of neural networks. The proposed attack, which implicitly poisons the intermediate data structures that retain the training instances between training epochs acquires its high-risk property from attacking data structures that are typically unobserved by professionals. Hence, the attack goes unnoticed despite the damage it can cause. Moreover, the attack can be executed without the attackers' knowledge of the neural network structure or training data making it more dangerous. The attack was tested under a sensitive application of secure cognitive cities, namely, biometric authentication. The conducted experiments showed that the proposed attack is effective and stealthy. Finally, the attack effectiveness property was concluded from the fact that it was able to flip the sign of the loss gradient in the conducted experiments to become positive, which indicated noisy and unstable training. Moreover, the attack was able to decrease the inference probability in the poisoned networks compared to their unpoisoned counterparts by 15.37%, 14.68%, and 24.88% for the Densenet, VGG, and Xception, respectively. Finally, the attack retained its stealthiness despite its high effectiveness. This was demonstrated by the fact that the attack did not cause a notable increase in the training time, in addition, the Fscore values only dropped by an average of 1.2%, 1.9%, and 1.5% for the poisoned Densenet, VGG, and Xception, respectively.

LGNov 10, 2022
Warmup and Transfer Knowledge-Based Federated Learning Approach for IoT Continuous Authentication

Mohamad Wazzeh, Hakima Ould-Slimane, Chamseddine Talhi et al.

Continuous behavioural authentication methods add a unique layer of security by allowing individuals to verify their unique identity when accessing a device. Maintaining session authenticity is now feasible by monitoring users' behaviour while interacting with a mobile or Internet of Things (IoT) device, making credential theft and session hijacking ineffective. Such a technique is made possible by integrating the power of artificial intelligence and Machine Learning (ML). Most of the literature focuses on training machine learning for the user by transmitting their data to an external server, subject to private user data exposure to threats. In this paper, we propose a novel Federated Learning (FL) approach that protects the anonymity of user data and maintains the security of his data. We present a warmup approach that provides a significant accuracy increase. In addition, we leverage the transfer learning technique based on feature extraction to boost the models' performance. Our extensive experiments based on four datasets: MNIST, FEMNIST, CIFAR-10 and UMDAA-02-FD, show a significant increase in user authentication accuracy while maintaining user privacy and data security.

68.1LGMar 30Code
NeiGAD: Augmenting Graph Anomaly Detection via Spectral Neighbor Information

Qing Qing, Huafei Huang, Mingliang Hou et al.

Graph anomaly detection (GAD) aims to identify irregular nodes or structures in attributed graphs. Neighbor information, which reflects both structural connectivity and attribute consistency with surrounding nodes, is essential for distinguishing anomalies from normal patterns. Although recent graph neural network (GNN)-based methods incorporate such information through message passing, they often fail to explicitly model its effect or interaction with attributes, limiting detection performance. This work introduces NeiGAD, a novel plug-and-play module that captures neighbor information through spectral graph analysis. Theoretical insights demonstrate that eigenvectors of the adjacency matrix encode local neighbor interactions and progressively amplify anomaly signals. Based on this, NeiGAD selects a compact set of eigenvectors to construct efficient and discriminative representations. Experiments on eight real-world datasets show that NeiGAD consistently improves detection accuracy and outperforms state-of-the-art GAD methods. These results demonstrate the importance of explicit neighbor modeling and the effectiveness of spectral analysis in anomaly detection. Code is available at: https://github.com/huafeihuang/NeiGAD.

MLJul 1, 2024
Probabilistic Conformal Prediction with Approximate Conditional Validity

Vincent Plassier, Alexander Fishkov, Mohsen Guizani et al.

We develop a new method for generating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution $P_{Y \mid X}$. Existing methods, such as conformalized quantile regression and probabilistic conformal prediction, usually provide only a marginal coverage guarantee. In contrast, our approach extends these frameworks to achieve approximately conditional coverage, which is crucial for many practical applications. Our prediction sets adapt to the behavior of the predictive distribution, making them effective even under high heteroscedasticity. While exact conditional guarantees are infeasible without assumptions on the underlying data distribution, we derive non-asymptotic bounds that depend on the total variation distance of the conditional distribution and its estimate. Using extensive simulations, we show that our method consistently outperforms existing approaches in terms of conditional coverage, leading to more reliable statistical inference in a variety of applications.

CVJul 10, 2024
CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging

Raza Imam, Mohammed Talha Alam, Umaima Rahman et al.

Existing vision-text contrastive learning models enhance representation transferability and support zero-shot prediction by matching paired image and caption embeddings while pushing unrelated pairs apart. However, astronomical image-label datasets are significantly smaller compared to general image and label datasets available from the internet. We introduce CosmoCLIP, an astronomical image-text contrastive learning framework precisely fine-tuned on the pre-trained CLIP model using SpaceNet and BLIP-based captions. SpaceNet, attained via FLARE, constitutes ~13k optimally distributed images, while BLIP acts as a rich knowledge extractor. The rich semantics derived from this SpaceNet and BLIP descriptions, when learned contrastively, enable CosmoCLIP to achieve superior generalization across various in-domain and out-of-domain tasks. Our results demonstrate that CosmoCLIP is a straightforward yet powerful framework, significantly outperforming CLIP in zero-shot classification and image-text retrieval tasks.

CVJul 9, 2024
AstroSpy: On detecting Fake Images in Astronomy via Joint Image-Spectral Representations

Mohammed Talha Alam, Raza Imam, Mohsen Guizani et al.

The prevalence of AI-generated imagery has raised concerns about the authenticity of astronomical images, especially with advanced text-to-image models like Stable Diffusion producing highly realistic synthetic samples. Existing detection methods, primarily based on convolutional neural networks (CNNs) or spectral analysis, have limitations when used independently. We present AstroSpy, a hybrid model that integrates both spectral and image features to distinguish real from synthetic astronomical images. Trained on a unique dataset of real NASA images and AI-generated fakes (approximately 18k samples), AstroSpy utilizes a dual-pathway architecture to fuse spatial and spectral information. This approach enables AstroSpy to achieve superior performance in identifying authentic astronomical images. Extensive evaluations demonstrate AstroSpy's effectiveness and robustness, significantly outperforming baseline models in both in-domain and cross-domain tasks, highlighting its potential to combat misinformation in astronomy.

CVMay 22, 2024Code
FLARE up your data: Diffusion-based Augmentation Method in Astronomical Imaging

Mohammed Talha Alam, Raza Imam, Mohsen Guizani et al.

The intersection of Astronomy and AI encounters significant challenges related to issues such as noisy backgrounds, lower resolution (LR), and the intricate process of filtering and archiving images from advanced telescopes like the James Webb. Given the dispersion of raw images in feature space, we have proposed a \textit{two-stage augmentation framework} entitled as \textbf{FLARE} based on \underline{f}eature \underline{l}earning and \underline{a}ugmented \underline{r}esolution \underline{e}nhancement. We first apply lower (LR) to higher resolution (HR) conversion followed by standard augmentations. Secondly, we integrate a diffusion approach to synthetically generate samples using class-concatenated prompts. By merging these two stages using weighted percentiles, we realign the feature space distribution, enabling a classification model to establish a distinct decision boundary and achieve superior generalization on various in-domain and out-of-domain tasks. We conducted experiments on several downstream cosmos datasets and on our optimally distributed \textbf{SpaceNet} dataset across 8-class fine-grained and 4-class macro classification tasks. FLARE attains the highest performance gain of 20.78\% for fine-grained tasks compared to similar baselines, while across different classification models, FLARE shows a consistent increment of an average of +15\%. This outcome underscores the effectiveness of the FLARE method in enhancing the precision of image classification, ultimately bolstering the reliability of astronomical research outcomes. % Our code and SpaceNet dataset will be released to the public soon. Our code and SpaceNet dataset is available at \href{https://github.com/Razaimam45/PlanetX_Dxb}{\textit{https://github.com/Razaimam45/PlanetX\_Dxb}}.

MLSep 26, 2025Code
Multidimensional Uncertainty Quantification via Optimal Transport

Nikita Kotelevskii, Maiya Goloburda, Vladimir Kondratyev et al.

Most uncertainty quantification (UQ) approaches provide a single scalar value as a measure of model reliability. However, different uncertainty measures could provide complementary information on the prediction confidence. Even measures targeting the same type of uncertainty (e.g., ensemble-based and density-based measures of epistemic uncertainty) may capture different failure modes. We take a multidimensional view on UQ by stacking complementary UQ measures into a vector. Such vectors are assigned with Monge-Kantorovich ranks produced by an optimal-transport-based ordering method. The prediction is then deemed more uncertain than the other if it has a higher rank. The resulting VecUQ-OT algorithm uses entropy-regularized optimal transport. The transport map is learned on vectors of scores from in-distribution data and, by design, applies to unseen inputs, including out-of-distribution cases, without retraining. Our framework supports flexible non-additive uncertainty fusion (including aleatoric and epistemic components). It yields a robust ordering for downstream tasks such as selective prediction, misclassification detection, out-of-distribution detection, and selective generation. Across synthetic, image, and text data, VecUQ-OT shows high efficiency even when individual measures fail. The code for the method is available at: https://github.com/stat-ml/multidimensional_uncertainty.

DCApr 7, 2025Code
Prima.cpp: Fast 30-70B LLM Inference on Heterogeneous and Low-Resource Home Clusters

Zonghang Li, Tao Li, Wenjiao Feng et al.

On-device inference offers privacy, offline use, and instant response, but consumer hardware restricts large language models (LLMs) to low throughput and capability. To overcome this challenge, we present prima.cpp, a distributed on-device inference system that runs 30-70B LLMs on consumer home clusters with mixed CPUs/GPUs, insufficient RAM/VRAM, slow disks, Wi-Fi links, and heterogeneous OSs. We introduce pipelined-ring parallelism (PRP) to overlap disk I/O with compute and communication, and address the prefetch-release conflict in mmap-based offloading. We further propose Halda, a heterogeneity-aware scheduler that co-optimizes per-device CPU/GPU workloads and device selection under RAM/VRAM constraints. On four consumer home devices, a 70B model reaches 674 ms/token TPOT with <6% memory pressure, and a 32B model with speculative decoding achieves 26 tokens/s. Compared with llama.cpp, exo, and dllama, our proposed prima.cpp achieves 5-17x lower TPOT, supports fine-grained model sizes from 8B to 70B, ensures broader cross-OS and quantization compatibility, and remains OOM-free, while also being Wi-Fi tolerant, privacy-preserving, and hardware-independent. The code is available at https://gitee.com/zonghang-li/prima.cpp.

AIJun 14, 2024Code
Efficient Prompting for LLM-based Generative Internet of Things

Bin Xiao, Burak Kantarci, Jiawen Kang et al.

Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating the capacities of LLMs into the Internet of Things (IoT) applications has drawn much research attention recently. Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network setting. However, open-source LLMs usually have more limitations regarding their performance, such as their arithmetic calculation and reasoning capacities, and practical systems of applying LLMs to IoT have yet to be well-explored. Therefore, we propose a LLM-based Generative IoT (GIoT) system deployed in the local network setting in this study. To alleviate the limitations of LLMs and provide service with competitive performance, we apply prompt engineering methods to enhance the capacities of the open-source LLMs, design a Prompt Management Module and a Post-processing Module to manage the tailored prompts for different tasks and process the results generated by the LLMs. To demonstrate the effectiveness of the proposed system, we discuss a challenging Table Question Answering (Table-QA) task as a case study of the proposed system, as tabular data is usually more challenging than plain text because of their complex structures, heterogeneous data types and sometimes huge sizes. We conduct comprehensive experiments on two popular Table-QA datasets, and the results show that our proposal can achieve competitive performance compared with state-of-the-art LLMs, demonstrating that the proposed LLM-based GIoT system can provide competitive performance with tailored prompting methods and is easily extensible to new tasks without training.

ITJun 5, 2024Code
CSI-GPT: Integrating Generative Pre-Trained Transformer with Federated-Tuning to Acquire Downlink Massive MIMO Channels

Ye Zeng, Li Qiao, Zhen Gao et al.

In massive multiple-input multiple-output (MIMO) systems, how to reliably acquire downlink channel state information (CSI) with low overhead is challenging. In this work, by integrating the generative pre-trained Transformer (GPT) with federated-tuning, we propose a CSI-GPT approach to realize efficient downlink CSI acquisition. Specifically, we first propose a Swin Transformer-based channel acquisition network (SWTCAN) to acquire downlink CSI, where pilot signals, downlink channel estimation, and uplink CSI feedback are jointly designed. Furthermore, to solve the problem of insufficient training data, we propose a variational auto-encoder-based channel sample generator (VAE-CSG), which can generate sufficient CSI samples based on a limited number of high-quality CSI data obtained from the current cell. The CSI dataset generated from VAE-CSG will be used for pre-training SWTCAN. To fine-tune the pre-trained SWTCAN for improved performance, we propose an online federated-tuning method, where only a small amount of SWTCAN parameters are unfrozen and updated using over-the-air computation, avoiding the high communication overhead caused by aggregating the complete CSI samples from user equipment (UEs) to the BS for centralized fine-tuning. Simulation results verify the advantages of the proposed SWTCAN and the communication efficiency of the proposed federated-tuning method. Our code is publicly available at https://github.com/BIT-ZY/CSI-GPT

IVJan 29, 2025Code
PulmoFusion: Advancing Pulmonary Health with Efficient Multi-Modal Fusion

Ahmed Sharshar, Yasser Attia, Mohammad Yaqub et al.

Traditional remote spirometry lacks the precision required for effective pulmonary monitoring. We present a novel, non-invasive approach using multimodal predictive models that integrate RGB or thermal video data with patient metadata. Our method leverages energy-efficient Spiking Neural Networks (SNNs) for the regression of Peak Expiratory Flow (PEF) and classification of Forced Expiratory Volume (FEV1) and Forced Vital Capacity (FVC), using lightweight CNNs to overcome SNN limitations in regression tasks. Multimodal data integration is improved with a Multi-Head Attention Layer, and we employ K-Fold validation and ensemble learning to boost robustness. Using thermal data, our SNN models achieve 92% accuracy on a breathing-cycle basis and 99.5% patient-wise. PEF regression models attain Relative RMSEs of 0.11 (thermal) and 0.26 (RGB), with an MAE of 4.52% for FEV1/FVC predictions, establishing state-of-the-art performance. Code and dataset can be found on https://github.com/ahmed-sharshar/RespiroDynamics.git

NIDec 19, 2024
Overview of AI and Communication for 6G Network: Fundamentals, Challenges, and Future Research Opportunities

Qimei Cui, Xiaohu You, Ni Wei et al.

With the growing demand for seamless connectivity and intelligent communication, the integration of artificial intelligence (AI) and sixth-generation (6G) communication networks has emerged as a transformative paradigm. By embedding AI capabilities across various network layers, this integration enables optimized resource allocation, improved efficiency, and enhanced system robust performance, particularly in intricate and dynamic environments. This paper presents a comprehensive overview of AI and communication for 6G networks, with a focus on emphasizing their foundational principles, inherent challenges, and future research opportunities. We first review the integration of AI and communications in the context of 6G, exploring the driving factors behind incorporating AI into wireless communications, as well as the vision for the convergence of AI and 6G. The discourse then transitions to a detailed exposition of the envisioned integration of AI within 6G networks, delineated across three progressive developmental stages. The first stage, AI for Network, focuses on employing AI to augment network performance, optimize efficiency, and enhance user service experiences. The second stage, Network for AI, highlights the role of the network in facilitating and buttressing AI operations and presents key enabling technologies, such as digital twins for AI and semantic communication. In the final stage, AI as a Service, it is anticipated that future 6G networks will innately provide AI functions as services, supporting application scenarios like immersive communication and intelligent industrial robots. In addition, we conduct an in-depth analysis of the critical challenges faced by the integration of AI and communications in 6G. Finally, we outline promising future research opportunities that are expected to drive the development and refinement of AI and 6G communications.

NIDec 16, 2024
A Survey on Large Language Models for Communication, Network, and Service Management: Application Insights, Challenges, and Future Directions

Gordon Owusu Boateng, Hani Sami, Ahmed Alagha et al.

The rapid evolution of communication networks in recent decades has intensified the need for advanced Network and Service Management (NSM) strategies to address the growing demands for efficiency, scalability, enhanced performance, and reliability of these networks. Large Language Models (LLMs) have received tremendous attention due to their unparalleled capabilities in various Natural Language Processing (NLP) tasks and generating context-aware insights, offering transformative potential for automating diverse communication NSM tasks. Contrasting existing surveys that consider a single network domain, this survey investigates the integration of LLMs across different communication network domains, including mobile networks and related technologies, vehicular networks, cloud-based networks, and fog/edge-based networks. First, the survey provides foundational knowledge of LLMs, explicitly detailing the generic transformer architecture, general-purpose and domain-specific LLMs, LLM model pre-training and fine-tuning, and their relation to communication NSM. Under a novel taxonomy of network monitoring and reporting, AI-powered network planning, network deployment and distribution, and continuous network support, we extensively categorize LLM applications for NSM tasks in each of the different network domains, exploring existing literature and their contributions thus far. Then, we identify existing challenges and open issues, as well as future research directions for LLM-driven communication NSM, emphasizing the need for scalable, adaptable, and resource-efficient solutions that align with the dynamic landscape of communication networks. We envision that this survey serves as a holistic roadmap, providing critical insights for leveraging LLMs to enhance NSM.

CVFeb 11, 2025
Vision-Language Models for Edge Networks: A Comprehensive Survey

Ahmed Sharshar, Latif U. Khan, Waseem Ullah et al.

Vision Large Language Models (VLMs) combine visual understanding with natural language processing, enabling tasks like image captioning, visual question answering, and video analysis. While VLMs show impressive capabilities across domains such as autonomous vehicles, smart surveillance, and healthcare, their deployment on resource-constrained edge devices remains challenging due to processing power, memory, and energy limitations. This survey explores recent advancements in optimizing VLMs for edge environments, focusing on model compression techniques, including pruning, quantization, knowledge distillation, and specialized hardware solutions that enhance efficiency. We provide a detailed discussion of efficient training and fine-tuning methods, edge deployment challenges, and privacy considerations. Additionally, we discuss the diverse applications of lightweight VLMs across healthcare, environmental monitoring, and autonomous systems, illustrating their growing impact. By highlighting key design strategies, current challenges, and offering recommendations for future directions, this survey aims to inspire further research into the practical deployment of VLMs, ultimately making advanced AI accessible in resource-limited settings.

CVOct 25, 2024
GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing

Hosam Elgendy, Ahmed Sharshar, Ahmed Aboeitta et al.

Detecting temporal changes in geographical landscapes is critical for applications like environmental monitoring and urban planning. While remote sensing data is abundant, existing vision-language models (VLMs) often fail to capture temporal dynamics effectively. This paper addresses these limitations by introducing an annotated dataset of video frame pairs to track evolving geographical patterns over time. Using fine-tuning techniques like Low-Rank Adaptation (LoRA), quantized LoRA (QLoRA), and model pruning on models such as Video-LLaVA and LLaVA-NeXT-Video, we significantly enhance VLM performance in processing remote sensing temporal changes. Results show significant improvements, with the best performance achieving a BERT score of 0.864 and ROUGE-1 score of 0.576, demonstrating superior accuracy in describing land-use transformations.

ROJan 13
Large Multimodal Models for Embodied Intelligent Driving: The Next Frontier in Self-Driving?

Long Zhang, Yuchen Xia, Bingqing Wei et al.

The advent of Large Multimodal Models (LMMs) offers a promising technology to tackle the limitations of modular design in autonomous driving, which often falters in open-world scenarios requiring sustained environmental understanding and logical reasoning. Besides, embodied artificial intelligence facilitates policy optimization through closed-loop interactions to achieve the continuous learning capability, thereby advancing autonomous driving toward embodied intelligent (El) driving. However, such capability will be constrained by relying solely on LMMs to enhance EI driving without joint decision-making. This article introduces a novel semantics and policy dual-driven hybrid decision framework to tackle this challenge, ensuring continuous learning and joint decision. The framework merges LMMs for semantic understanding and cognitive representation, and deep reinforcement learning (DRL) for real-time policy optimization. We starts by introducing the foundational principles of EI driving and LMMs. Moreover, we examine the emerging opportunities this framework enables, encompassing potential benefits and representative use cases. A case study is conducted experimentally to validate the performance superiority of our framework in completing lane-change planning task. Finally, several future research directions to empower EI driving are identified to guide subsequent work.

MLFeb 22, 2025
Rectifying Conformity Scores for Better Conditional Coverage

Vincent Plassier, Alexander Fishkov, Victor Dheur et al.

We present a new method for generating confidence sets within the split conformal prediction framework. Our method performs a trainable transformation of any given conformity score to improve conditional coverage while ensuring exact marginal coverage. The transformation is based on an estimate of the conditional quantile of conformity scores. The resulting method is particularly beneficial for constructing adaptive confidence sets in multi-output problems where standard conformal quantile regression approaches have limited applicability. We develop a theoretical bound that captures the influence of the accuracy of the quantile estimate on the approximate conditional validity, unlike classical bounds for conformal prediction methods that only offer marginal coverage. We experimentally show that our method is highly adaptive to the local data structure and outperforms existing methods in terms of conditional coverage, improving the reliability of statistical inference in various applications.

GTMay 1, 2024
Enhancing Mutual Trustworthiness in Federated Learning for Data-Rich Smart Cities

Osama Wehbi, Sarhad Arisdakessian, Mohsen Guizani et al.

Federated learning is a promising collaborative and privacy-preserving machine learning approach in data-rich smart cities. Nevertheless, the inherent heterogeneity of these urban environments presents a significant challenge in selecting trustworthy clients for collaborative model training. The usage of traditional approaches, such as the random client selection technique, poses several threats to the system's integrity due to the possibility of malicious client selection. Primarily, the existing literature focuses on assessing the trustworthiness of clients, neglecting the crucial aspect of trust in federated servers. To bridge this gap, in this work, we propose a novel framework that addresses the mutual trustworthiness in federated learning by considering the trust needs of both the client and the server. Our approach entails: (1) Creating preference functions for servers and clients, allowing them to rank each other based on trust scores, (2) Establishing a reputation-based recommendation system leveraging multiple clients to assess newly connected servers, (3) Assigning credibility scores to recommending devices for better server trustworthiness measurement, (4) Developing a trust assessment mechanism for smart devices using a statistical Interquartile Range (IQR) method, (5) Designing intelligent matching algorithms considering the preferences of both parties. Based on simulation and experimental results, our approach outperforms baseline methods by increasing trust levels, global model accuracy, and reducing non-trustworthy clients in the system.

CRMay 1, 2024
Trust Driven On-Demand Scheme for Client Deployment in Federated Learning

Mario Chahoud, Azzam Mourad, Hadi Otrok et al.

Containerization technology plays a crucial role in Federated Learning (FL) setups, expanding the pool of potential clients and ensuring the availability of specific subsets for each learning iteration. However, doubts arise about the trustworthiness of devices deployed as clients in FL scenarios, especially when container deployment processes are involved. Addressing these challenges is important, particularly in managing potentially malicious clients capable of disrupting the learning process or compromising the entire model. In our research, we are motivated to integrate a trust element into the client selection and model deployment processes within our system architecture. This is a feature lacking in the initial client selection and deployment mechanism of the On-Demand architecture. We introduce a trust mechanism, named "Trusted-On-Demand-FL", which establishes a relationship of trust between the server and the pool of eligible clients. Utilizing Docker in our deployment strategy enables us to monitor and validate participant actions effectively, ensuring strict adherence to agreed-upon protocols while strengthening defenses against unauthorized data access or tampering. Our simulations rely on a continuous user behavior dataset, deploying an optimization model powered by a genetic algorithm to efficiently select clients for participation. By assigning trust values to individual clients and dynamically adjusting these values, combined with penalizing malicious clients through decreased trust scores, our proposed framework identifies and isolates harmful clients. This approach not only reduces disruptions to regular rounds but also minimizes instances of round dismissal, Consequently enhancing both system stability and security.

CRFeb 6, 2025
Safeguarding connected autonomous vehicle communication: Protocols, intra- and inter-vehicular attacks and defenses

Mohammed Aledhari, Rehma Razzak, Mohamed Rahouti et al.

The advancements in autonomous driving technology, coupled with the growing interest from automotive manufacturers and tech companies, suggest a rising adoption of Connected Autonomous Vehicles (CAVs) in the near future. Despite some evidence of higher accident rates in AVs, these incidents tend to result in less severe injuries compared to traditional vehicles due to cooperative safety measures. However, the increased complexity of CAV systems exposes them to significant security vulnerabilities, potentially compromising their performance and communication integrity. This paper contributes by presenting a detailed analysis of existing security frameworks and protocols, focusing on intra- and inter-vehicle communications. We systematically evaluate the effectiveness of these frameworks in addressing known vulnerabilities and propose a set of best practices for enhancing CAV communication security. The paper also provides a comprehensive taxonomy of attack vectors in CAV ecosystems and suggests future research directions for designing more robust security mechanisms. Our key contributions include the development of a new classification system for CAV security threats, the proposal of practical security protocols, and the introduction of use cases that demonstrate how these protocols can be integrated into real-world CAV applications. These insights are crucial for advancing secure CAV adoption and ensuring the safe integration of autonomous vehicles into intelligent transportation systems.

CVJan 21, 2025
UAV-Assisted Real-Time Disaster Detection Using Optimized Transformer Model

Branislava Jankovic, Sabina Jangirova, Waseem Ullah et al.

Dangerous surroundings and difficult-to-reach landscapes introduce significant complications for adequate disaster management and recuperation. These problems can be solved by engaging unmanned aerial vehicles (UAVs) provided with embedded platforms and optical sensors. In this work, we focus on enabling onboard aerial image processing to ensure proper and real-time disaster detection. Such a setting usually causes challenges due to the limited hardware resources of UAVs. However, privacy, connectivity, and latency issues can be avoided. We suggest a UAV-assisted edge framework for disaster detection, leveraging our proposed model optimized for onboard real-time aerial image classification. The optimization of the model is achieved using post-training quantization techniques. To address the limited number of disaster cases in existing benchmark datasets and therefore ensure real-world adoption of our model, we construct a novel dataset, DisasterEye, featuring disaster scenes captured by UAVs and individuals on-site. Experimental results reveal the efficacy of our model, reaching high accuracy with lowered inference latency and memory use on both traditional machines and resource-limited devices. This shows that the scalability and adaptability of our method make it a powerful solution for real-time disaster management on resource-constrained UAV platforms.

53.1NEApr 7
Constraint-Driven Warm-Freeze for Efficient Transfer Learning in Photovoltaic Systems

Yasmeen Saeed, Ahmed Sharshar, Mohsen Guizani

Detecting cyberattacks in photovoltaic (PV) monitoring and MPPT control signals requires models that are robust to bias, drift, and transient spikes, yet lightweight enough for resource-constrained edge controllers. While deep learning outperforms traditional physics-based diagnostics and handcrafted features, standard fine-tuning is computationally prohibitive for edge devices. Furthermore, existing Parameter-Efficient Fine-Tuning (PEFT) methods typically apply uniform adaptation or rely on expensive architectural searches, lacking the flexibility to adhere to strict hardware budgets. To bridge this gap, we propose Constraint-Driven Warm-Freeze (CDWF), a budget-aware adaptation framework. CDWF leverages a brief warm-start phase to quantify gradient-based block importance, then solves a constrained optimization problem to dynamically allocate full trainability to high-impact blocks while efficiently adapting the remaining blocks via Low-Rank Adaptation (LoRA). We evaluate CDWF on standard vision benchmarks (CIFAR-10/100) and a novel PV cyberattack dataset, transferring from bias pretraining to drift and spike detection. The experiments demonstrate that CDWF retains 90 to 99% of full fine-tuning performance while reducing trainable parameters by up to 120x. These results establish CDWF as an effective, importance-guided solution for reliable transfer learning under tight edge constraints.

LGSep 30, 2025
Uncertainty Quantification for Regression using Proper Scoring Rules

Alexander Fishkov, Kajetan Schweighofer, Mykyta Ielanskyi et al.

Quantifying uncertainty of machine learning model predictions is essential for reliable decision-making, especially in safety-critical applications. Recently, uncertainty quantification (UQ) theory has advanced significantly, building on a firm basis of learning with proper scoring rules. However, these advances were focused on classification, while extending these ideas to regression remains challenging. In this work, we introduce a unified UQ framework for regression based on proper scoring rules, such as CRPS, logarithmic, squared error, and quadratic scores. We derive closed-form expressions for the resulting uncertainty measures under practical parametric assumptions and show how to estimate them using ensembles of models. In particular, the derived uncertainty measures naturally decompose into aleatoric and epistemic components. The framework recovers popular regression UQ measures based on predictive variance and differential entropy. Our broad evaluation on synthetic and real-world regression datasets provides guidance for selecting reliable UQ measures.

MLMay 21, 2025
Adaptive Set-Mass Calibration with Conformal Prediction

Daniil Kazantsev, Mohsen Guizani, Eric Moulines et al.

Reliable probabilities are critical in high-risk applications, yet common calibration criteria (confidence, class-wise) are only necessary for full distributional calibration, and post-hoc methods often lack distribution-free guarantees. We propose a set-based notion of calibration, cumulative mass calibration, and a corresponding empirical error measure: the Cumulative Mass Calibration Error (CMCE). We develop a new calibration procedure that starts with conformal prediction to obtain a set of labels that gives the desired coverage. We then instantiate two simple post-hoc calibrators: a mass normalization and a temperature scaling-based rule, tuned to the conformal constraint. On multi-class image benchmarks, especially with a large number of classes, our methods consistently improve CMCE and standard metrics (ECE, cw-ECE, MCE) over baselines, delivering a practical, scalable framework with theoretical guarantees.

NIFeb 26, 2025
A Multi-Agent DRL-Based Framework for Optimal Resource Allocation and Twin Migration in the Multi-Tier Vehicular Metaverse

Nahom Abishu Hayla, A. Mohammed Seid, Aiman Erbad et al.

Although multi-tier vehicular Metaverse promises to transform vehicles into essential nodes -- within an interconnected digital ecosystem -- using efficient resource allocation and seamless vehicular twin (VT) migration, this can hardly be achieved by the existing techniques operating in a highly dynamic vehicular environment, since they can hardly balance multi-objective optimization problems such as latency reduction, resource utilization, and user experience (UX). To address these challenges, we introduce a novel multi-tier resource allocation and VT migration framework that integrates Graph Convolutional Networks (GCNs), a hierarchical Stackelberg game-based incentive mechanism, and Multi-Agent Deep Reinforcement Learning (MADRL). The GCN-based model captures both spatial and temporal dependencies within the vehicular network; the Stackelberg game-based incentive mechanism fosters cooperation between vehicles and infrastructure; and the MADRL algorithm jointly optimizes resource allocation and VT migration in real time. By modeling this dynamic and multi-tier vehicular Metaverse as a Markov Decision Process (MDP), we develop a MADRL-based algorithm dubbed the Multi-Objective Multi-Agent Deep Deterministic Policy Gradient (MO-MADDPG), which can effectively balances the various conflicting objectives. Extensive simulations validate the effectiveness of this algorithm that is demonstrated to enhance scalability, reliability, and efficiency while considerably improving latency, resource utilization, migration cost, and overall UX by 12.8%, 9.7%, 14.2%, and 16.1%, respectively.

CVAug 14, 2025
ChatENV: An Interactive Vision-Language Model for Sensor-Guided Environmental Monitoring and Scenario Simulation

Hosam Elgendy, Ahmed Sharshar, Ahmed Aboeitta et al.

Understanding environmental changes from aerial imagery is vital for climate resilience, urban planning, and ecosystem monitoring. Yet, current vision language models (VLMs) overlook causal signals from environmental sensors, rely on single-source captions prone to stylistic bias, and lack interactive scenario-based reasoning. We present ChatENV, the first interactive VLM that jointly reasons over satellite image pairs and real-world sensor data. Our framework: (i) creates a 177k-image dataset forming 152k temporal pairs across 62 land-use classes in 197 countries with rich sensor metadata (e.g., temperature, PM10, CO); (ii) annotates data using GPT- 4o and Gemini 2.0 for stylistic and semantic diversity; and (iii) fine-tunes Qwen-2.5-VL using efficient Low-Rank Adaptation (LoRA) adapters for chat purposes. ChatENV achieves strong performance in temporal and "what-if" reasoning (e.g., BERT-F1 0.903) and rivals or outperforms state-of-the-art temporal models, while supporting interactive scenario-based analysis. This positions ChatENV as a powerful tool for grounded, sensor-aware environmental monitoring.

CVJul 28, 2025
Not Only Grey Matter: OmniBrain for Robust Multimodal Classification of Alzheimer's Disease

Ahmed Sharshar, Yasser Ashraf, Tameem Bakr et al.

Alzheimer's disease affects over 55 million people worldwide and is projected to more than double by 2050, necessitating rapid, accurate, and scalable diagnostics. However, existing approaches are limited because they cannot achieve clinically acceptable accuracy, generalization across datasets, robustness to missing modalities, and explainability all at the same time. This inability to satisfy all these requirements simultaneously undermines their reliability in clinical settings. We propose OmniBrain, a multimodal framework that integrates brain MRI, radiomics, gene expression, and clinical data using a unified model with cross-attention and modality dropout. OmniBrain achieves $92.2 \pm 2.4\%$accuracy on the ANMerge dataset and generalizes to the MRI-only ADNI dataset with $70.4 \pm 2.7\%$ accuracy, outperforming unimodal and prior multimodal approaches. Explainability analyses highlight neuropathologically relevant brain regions and genes, enhancing clinical trust. OmniBrain offers a robust, interpretable, and practical solution for real-world Alzheimer's diagnosis.

CLJul 25, 2025
Towards Inclusive NLP: Assessing Compressed Multilingual Transformers across Diverse Language Benchmarks

Maitha Alshehhi, Ahmed Sharshar, Mohsen Guizani

Although LLMs have attained significant success in high-resource languages, their capacity in low-resource linguistic environments like Kannada and Arabic is not yet fully understood. This work benchmarking the performance of multilingual and monolingual Large Language Models (LLMs) across Arabic, English, and Indic languages, with particular emphasis on the effects of model compression strategies such as pruning and quantization. Findings shows significant performance differences driven by linguistic diversity and resource availability on SOTA LLMS as BLOOMZ, AceGPT, Jais, LLaMA-2, XGLM, and AraGPT2. We find that multilingual versions of the model outperform their language-specific counterparts across the board, indicating substantial cross-lingual transfer benefits. Quantization (4-bit and 8-bit) is effective in maintaining model accuracy while promoting efficiency, but aggressive pruning significantly compromises performance, especially in bigger models. Our findings pinpoint key strategies to construct scalable and fair multilingual NLP solutions and underscore the need for interventions to address hallucination and generalization errors in the low-resource setting.

LGJul 12, 2025
Heterogeneous Graph Prompt Learning via Adaptive Weight Pruning

Chu-Yuan Wei, Shun-Yao Liu, Sheng-Da Zhuo et al.

Graph Neural Networks (GNNs) have achieved remarkable success in various graph-based tasks (e.g., node classification or link prediction). Despite their triumphs, GNNs still face challenges such as long training and inference times, difficulty in capturing complex relationships, and insufficient feature extraction. To tackle these issues, graph pre-training and graph prompt methods have garnered increasing attention for their ability to leverage large-scale datasets for initial learning and task-specific adaptation, offering potential improvements in GNN performance. However, previous research has overlooked the potential of graph prompts in optimizing models, as well as the impact of both positive and negative graph prompts on model stability and efficiency. To bridge this gap, we propose a novel framework combining graph prompts with weight pruning, called GPAWP, which aims to enhance the performance and efficiency of graph prompts by using fewer of them. We evaluate the importance of graph prompts using an importance assessment function to determine positive and negative weights at different granularities. Through hierarchically structured pruning, we eliminate negative prompt labels, resulting in more parameter-efficient and competitively performing prompts. Extensive experiments on three benchmark datasets demonstrate the superiority of GPAWP, leading to a significant reduction in parameters in node classification tasks.

DCMay 19, 2025
Learning In Chaos: Efficient Autoscaling and Self-Healing for Multi-Party Distributed Training

Wenjiao Feng, Rongxing Xiao, Zonghang Li et al.

Node and link churn in multi-party, cross-region clusters over wide-area networks (WANs) often disrupts distributed training. However, checkpoint-based recovery and cloud-centric autoscaling react slowly and assume centralized control, which is misaligned with the self-governed setup where institutions can freely join and leave. This paper proposes Chaos, a multi-party distributed training system with self-healing and autoscaling, enabling robust and elastic training under churn. It speeds up autoscaling via multi-neighbor state replication and model sharding. We formalize the sharding and assignment as a MINLP that captures WAN heterogeneity, and reduce it to a tractable MILP by analyzing its monotonicity on a divisibility chain. By establishing an equivalence, we derive a greedy algorithm that follows optimality rules and yields the optimal solution in polynomial time. Chaos uses a cluster monitor to track resource and topology changes, and handles scaling events through peer negotiation protocols, enabling fully self-governed autoscaling among institutions. Experiments show that Chaos has substantially lower scale-out delay than Pollux, Elan, and Autoscaling, and handles scale-in, connect-link, and disconnect-link events within 20ms. It also delivers the lowest idle time, showing superior resource use and scalability as the cluster grows.

CVFeb 28, 2025
Real-Time Aerial Fire Detection on Resource-Constrained Devices Using Knowledge Distillation

Sabina Jangirova, Branislava Jankovic, Waseem Ullah et al.

Wildfire catastrophes cause significant environmental degradation, human losses, and financial damage. To mitigate these severe impacts, early fire detection and warning systems are crucial. Current systems rely primarily on fixed CCTV cameras with a limited field of view, restricting their effectiveness in large outdoor environments. The fusion of intelligent fire detection with remote sensing improves coverage and mobility, enabling monitoring in remote and challenging areas. Existing approaches predominantly utilize convolutional neural networks and vision transformer models. While these architectures provide high accuracy in fire detection, their computational complexity limits real-time performance on edge devices such as UAVs. In our work, we present a lightweight fire detection model based on MobileViT-S, compressed through the distillation of knowledge from a stronger teacher model. The ablation study highlights the impact of a teacher model and the chosen distillation technique on the model's performance improvement. We generate activation map visualizations using Grad-CAM to confirm the model's ability to focus on relevant fire regions. The high accuracy and efficiency of the proposed model make it well-suited for deployment on satellites, UAVs, and IoT devices for effective fire detection. Experiments on common fire benchmarks demonstrate that our model suppresses the state-of-the-art model by 0.44%, 2.00% while maintaining a compact model size. Our model delivers the highest processing speed among existing works, achieving real-time performance on resource-constrained devices.

LGDec 4, 2024
BGTplanner: Maximizing Training Accuracy for Differentially Private Federated Recommenders via Strategic Privacy Budget Allocation

Xianzhi Zhang, Yipeng Zhou, Miao Hu et al.

To mitigate the rising concern about privacy leakage, the federated recommender (FR) paradigm emerges, in which decentralized clients co-train the recommendation model without exposing their raw user-item rating data. The differentially private federated recommender (DPFR) further enhances FR by injecting differentially private (DP) noises into clients. Yet, current DPFRs, suffering from noise distortion, cannot achieve satisfactory accuracy. Various efforts have been dedicated to improving DPFRs by adaptively allocating the privacy budget over the learning process. However, due to the intricate relation between privacy budget allocation and model accuracy, existing works are still far from maximizing DPFR accuracy. To address this challenge, we develop BGTplanner (Budget Planner) to strategically allocate the privacy budget for each round of DPFR training, improving overall training performance. Specifically, we leverage the Gaussian process regression and historical information to predict the change in recommendation accuracy with a certain allocated privacy budget. Additionally, Contextual Multi-Armed Bandit (CMAB) is harnessed to make privacy budget allocation decisions by reconciling the current improvement and long-term privacy constraints. Our extensive experimental results on real datasets demonstrate that \emph{BGTplanner} achieves an average improvement of 6.76\% in training performance compared to state-of-the-art baselines.

SPJun 16, 2024
Multi-UAV Multi-RIS QoS-Aware Aerial Communication Systems using DRL and PSO

Marwan Dhuheir, Aiman Erbad, Ala Al-Fuqaha et al.

Recently, Unmanned Aerial Vehicles (UAVs) have attracted the attention of researchers in academia and industry for providing wireless services to ground users in diverse scenarios like festivals, large sporting events, natural and man-made disasters due to their advantages in terms of versatility and maneuverability. However, the limited resources of UAVs (e.g., energy budget and different service requirements) can pose challenges for adopting UAVs for such applications. Our system model considers a UAV swarm that navigates an area, providing wireless communication to ground users with RIS support to improve the coverage of the UAVs. In this work, we introduce an optimization model with the aim of maximizing the throughput and UAVs coverage through optimal path planning of UAVs and multi-RIS phase configurations. The formulated optimization is challenging to solve using standard linear programming techniques, limiting its applicability in real-time decision-making. Therefore, we introduce a two-step solution using deep reinforcement learning and particle swarm optimization. We conduct extensive simulations and compare our approach to two competitive solutions presented in the recent literature. Our simulation results demonstrate that our adopted approach is 20 \% better than the brute-force approach and 30\% better than the baseline solution in terms of QoS.

LGMay 12, 2024
On-Demand Model and Client Deployment in Federated Learning with Deep Reinforcement Learning

Mario Chahoud, Hani Sami, Azzam Mourad et al.

In Federated Learning (FL), the limited accessibility of data from diverse locations and user types poses a significant challenge due to restricted user participation. Expanding client access and diversifying data enhance models by incorporating diverse perspectives, thereby enhancing adaptability. However, challenges arise in dynamic and mobile environments where certain devices may become inaccessible as FL clients, impacting data availability and client selection methods. To address this, we propose an On-Demand solution, deploying new clients using Docker Containers on-the-fly. Our On-Demand solution, employing Deep Reinforcement Learning (DRL), targets client availability and selection, while considering data shifts, and container deployment complexities. It employs an autonomous end-to-end solution for handling model deployment and client selection. The DRL strategy uses a Markov Decision Process (MDP) framework, with a Master Learner and a Joiner Learner. The designed cost functions represent the complexity of the dynamic client deployment and selection. Simulated tests show that our architecture can easily adjust to changes in the environment and respond to On-Demand requests. This underscores its ability to improve client availability, capability, accuracy, and learning efficiency, surpassing heuristic and tabular reinforcement learning solutions.

LGFeb 18, 2022
PerFED-GAN: Personalized Federated Learning via Generative Adversarial Networks

Xingjian Cao, Gang Sun, Hongfang Yu et al.

Federated learning is gaining popularity as a distributed machine learning method that can be used to deploy AI-dependent IoT applications while protecting client data privacy and security. Due to the differences of clients, a single global model may not perform well on all clients, so the personalized federated learning method, which trains a personalized model for each client that better suits its individual needs, becomes a research hotspot. Most personalized federated learning research, however, focuses on data heterogeneity while ignoring the need for model architecture heterogeneity. Most existing federated learning methods uniformly set the model architecture of all clients participating in federated learning, which is inconvenient for each client's individual model and local data distribution requirements, and also increases the risk of client model leakage. This paper proposes a federated learning method based on co-training and generative adversarial networks(GANs) that allows each client to design its own model to participate in federated learning training independently without sharing any model architecture or parameter information with other clients or a center. In our experiments, the proposed method outperforms the existing methods in mean test accuracy by 42% when the client's model architecture and data distribution vary significantly.