LGJun 15, 2023
Joint Path planning and Power Allocation of a Cellular-Connected UAV using Apprenticeship Learning via Deep Inverse Reinforcement LearningAlireza Shamsoshoara, Fatemeh Lotfi, Sajad Mousavi et al.
This paper investigates an interference-aware joint path planning and power allocation mechanism for a cellular-connected unmanned aerial vehicle (UAV) in a sparse suburban environment. The UAV's goal is to fly from an initial point and reach a destination point by moving along the cells to guarantee the required quality of service (QoS). In particular, the UAV aims to maximize its uplink throughput and minimize the level of interference to the ground user equipment (UEs) connected to the neighbor cellular BSs, considering the shortest path and flight resource limitation. Expert knowledge is used to experience the scenario and define the desired behavior for the sake of the agent (i.e., UAV) training. To solve the problem, an apprenticeship learning method is utilized via inverse reinforcement learning (IRL) based on both Q-learning and deep reinforcement learning (DRL). The performance of this method is compared to learning from a demonstration technique called behavioral cloning (BC) using a supervised learning approach. Simulation and numerical results show that the proposed approach can achieve expert-level performance. We also demonstrate that, unlike the BC technique, the performance of our proposed approach does not degrade in unseen situations.
SYAug 30, 2022
Evolutionary Deep Reinforcement Learning for Dynamic Slice Management in O-RANFatemeh Lotfi, Omid Semiari, Fatemeh Afghah
The next-generation wireless networks are required to satisfy a variety of services and criteria concurrently. To address upcoming strict criteria, a new open radio access network (O-RAN) with distinguishing features such as flexible design, disaggregated virtual and programmable components, and intelligent closed-loop control was developed. O-RAN slicing is being investigated as a critical strategy for ensuring network quality of service (QoS) in the face of changing circumstances. However, distinct network slices must be dynamically controlled to avoid service level agreement (SLA) variation caused by rapid changes in the environment. Therefore, this paper introduces a novel framework able to manage the network slices through provisioned resources intelligently. Due to diverse heterogeneous environments, intelligent machine learning approaches require sufficient exploration to handle the harshest situations in a wireless network and accelerate convergence. To solve this problem, a new solution is proposed based on evolutionary-based deep reinforcement learning (EDRL) to accelerate and optimize the slice management learning process in the radio access network's (RAN) intelligent controller (RIC) modules. To this end, the O-RAN slicing is represented as a Markov decision process (MDP) which is then solved optimally for resource allocation to meet service demand using the EDRL approach. In terms of reaching service demands, simulation results show that the proposed approach outperforms the DRL baseline by 62.2%.
DCJun 15, 2023
Attention-based Open RAN Slice Management using Deep Reinforcement LearningFatemeh Lotfi, Fatemeh Afghah, Jonathan Ashdown
As emerging networks such as Open Radio Access Networks (O-RAN) and 5G continue to grow, the demand for various services with different requirements is increasing. Network slicing has emerged as a potential solution to address the different service requirements. However, managing network slices while maintaining quality of services (QoS) in dynamic environments is a challenging task. Utilizing machine learning (ML) approaches for optimal control of dynamic networks can enhance network performance by preventing Service Level Agreement (SLA) violations. This is critical for dependable decision-making and satisfying the needs of emerging networks. Although RL-based control methods are effective for real-time monitoring and controlling network QoS, generalization is necessary to improve decision-making reliability. This paper introduces an innovative attention-based deep RL (ADRL) technique that leverages the O-RAN disaggregated modules and distributed agent cooperation to achieve better performance through effective information extraction and implementing generalization. The proposed method introduces a value-attention network between distributed agents to enable reliable and optimal decision-making. Simulation results demonstrate significant improvements in network performance compared to other DRL baseline methods.
CVFeb 8, 2023
Triplet Loss-less Center Loss Sampling Strategies in Facial Expression Recognition ScenariosHossein Rajoli, Fatemeh Lotfi, Adham Atyabi et al.
Facial expressions convey massive information and play a crucial role in emotional expression. Deep neural network (DNN) accompanied by deep metric learning (DML) techniques boost the discriminative ability of the model in facial expression recognition (FER) applications. DNN, equipped with only classification loss functions such as Cross-Entropy cannot compact intra-class feature variation or separate inter-class feature distance as well as when it gets fortified by a DML supporting loss item. The triplet center loss (TCL) function is applied on all dimensions of the sample's embedding in the embedding space. In our work, we developed three strategies: fully-synthesized, semi-synthesized, and prediction-based negative sample selection strategies. To achieve better results, we introduce a selective attention module that provides a combination of pixel-wise and element-wise attention coefficients using high-semantic deep features of input samples. We evaluated the proposed method on the RAF-DB, a highly imbalanced dataset. The experimental results reveal significant improvements in comparison to the baseline for all three negative sample selection strategies.
NISep 30, 2024
Meta Reinforcement Learning Approach for Adaptive Resource Optimization in O-RANFatemeh Lotfi, Fatemeh Afghah
As wireless networks grow to support more complex applications, the Open Radio Access Network (O-RAN) architecture, with its smart RAN Intelligent Controller (RIC) modules, becomes a crucial solution for real-time network data collection, analysis, and dynamic management of network resources including radio resource blocks and downlink power allocation. Utilizing artificial intelligence (AI) and machine learning (ML), O-RAN addresses the variable demands of modern networks with unprecedented efficiency and adaptability. Despite progress in using ML-based strategies for network optimization, challenges remain, particularly in the dynamic allocation of resources in unpredictable environments. This paper proposes a novel Meta Deep Reinforcement Learning (Meta-DRL) strategy, inspired by Model-Agnostic Meta-Learning (MAML), to advance resource block and downlink power allocation in O-RAN. Our approach leverages O-RAN's disaggregated architecture with virtual distributed units (DUs) and meta-DRL strategies, enabling adaptive and localized decision-making that significantly enhances network efficiency. By integrating meta-learning, our system quickly adapts to new network conditions, optimizing resource allocation in real-time. This results in a 19.8% improvement in network management performance over traditional methods, advancing the capabilities of next-generation wireless networks.
LGJan 12, 2024
Open RAN LSTM Traffic Prediction and Slice Management using Deep Reinforcement LearningFatemeh Lotfi, Fatemeh Afghah
With emerging applications such as autonomous driving, smart cities, and smart factories, network slicing has become an essential component of 5G and beyond networks as a means of catering to a service-aware network. However, managing different network slices while maintaining quality of services (QoS) is a challenge in a dynamic environment. To address this issue, this paper leverages the heterogeneous experiences of distributed units (DUs) in ORAN systems and introduces a novel approach to ORAN slicing xApp using distributed deep reinforcement learning (DDRL). Additionally, to enhance the decision-making performance of the RL agent, a prediction rApp based on long short-term memory (LSTM) is incorporated to provide additional information from the dynamic environment to the xApp. Simulation results demonstrate significant improvements in network performance, particularly in reducing QoS violations. This emphasizes the importance of using the prediction rApp and distributed actors' information jointly as part of a dynamic xApp.
LGMay 31, 2025
ORAN-GUIDE: RAG-Driven Prompt Learning for LLM-Augmented Reinforcement Learning in O-RAN Network SlicingFatemeh Lotfi, Hossein Rajoli, Fatemeh Afghah
Advanced wireless networks must support highly dynamic and heterogeneous service demands. Open Radio Access Network (O-RAN) architecture enables this flexibility by adopting modular, disaggregated components, such as the RAN Intelligent Controller (RIC), Centralized Unit (CU), and Distributed Unit (DU), that can support intelligent control via machine learning (ML). While deep reinforcement learning (DRL) is a powerful tool for managing dynamic resource allocation and slicing, it often struggles to process raw, unstructured input like RF features, QoS metrics, and traffic trends. These limitations hinder policy generalization and decision efficiency in partially observable and evolving environments. To address this, we propose \textit{ORAN-GUIDE}, a dual-LLM framework that enhances multi-agent RL (MARL) with task-relevant, semantically enriched state representations. The architecture employs a domain-specific language model, ORANSight, pretrained on O-RAN control and configuration data, to generate structured, context-aware prompts. These prompts are fused with learnable tokens and passed to a frozen GPT-based encoder that outputs high-level semantic representations for DRL agents. This design adopts a retrieval-augmented generation (RAG) style pipeline tailored for technical decision-making in wireless systems. Experimental results show that ORAN-GUIDE improves sample efficiency, policy convergence, and performance generalization over standard MARL and single-LLM baselines.
LGMay 31, 2025
Prompt-Tuned LLM-Augmented DRL for Dynamic O-RAN Network SlicingFatemeh Lotfi, Hossein Rajoli, Fatemeh Afghah
Modern wireless networks must adapt to dynamic conditions while efficiently managing diverse service demands. Traditional deep reinforcement learning (DRL) struggles in these environments, as scattered and evolving feedback makes optimal decision-making challenging. Large Language Models (LLMs) offer a solution by structuring unorganized network feedback into meaningful latent representations, helping RL agents recognize patterns more effectively. For example, in O-RAN slicing, concepts like SNR, power levels and throughput are semantically related, and LLMs can naturally cluster them, providing a more interpretable state representation. To leverage this capability, we introduce a contextualization-based adaptation method that integrates learnable prompts into an LLM-augmented DRL framework. Instead of relying on full model fine-tuning, we refine state representations through task-specific prompts that dynamically adjust to network conditions. Utilizing ORANSight, an LLM trained on O-RAN knowledge, we develop Prompt-Augmented Multi agent RL (PA-MRL) framework. Learnable prompts optimize both semantic clustering and RL objectives, allowing RL agents to achieve higher rewards in fewer iterations and adapt more efficiently. By incorporating prompt-augmented learning, our approach enables faster, more scalable, and adaptive resource allocation in O-RAN slicing. Experimental results show that it accelerates convergence and outperforms other baselines.
AINov 19, 2025
Task Specific Sharpness Aware O-RAN Resource Management using Multi Agent Reinforcement LearningFatemeh Lotfi, Hossein Rajoli, Fatemeh Afghah
Next-generation networks utilize the Open Radio Access Network (O-RAN) architecture to enable dynamic resource management, facilitated by the RAN Intelligent Controller (RIC). While deep reinforcement learning (DRL) models show promise in optimizing network resources, they often struggle with robustness and generalizability in dynamic environments. This paper introduces a novel resource management approach that enhances the Soft Actor Critic (SAC) algorithm with Sharpness-Aware Minimization (SAM) in a distributed Multi-Agent RL (MARL) framework. Our method introduces an adaptive and selective SAM mechanism, where regularization is explicitly driven by temporal-difference (TD)-error variance, ensuring that only agents facing high environmental complexity are regularized. This targeted strategy reduces unnecessary overhead, improves training stability, and enhances generalization without sacrificing learning efficiency. We further incorporate a dynamic $ρ$ scheduling scheme to refine the exploration-exploitation trade-off across agents. Experimental results show our method significantly outperforms conventional DRL approaches, yielding up to a $22\%$ improvement in resource allocation efficiency and ensuring superior QoS satisfaction across diverse O-RAN slices.
ITNov 23, 2021
Semantic-Aware Collaborative Deep Reinforcement Learning Over Wireless Cellular NetworksFatemeh Lotfi, Omid Semiari, Walid Saad
Collaborative deep reinforcement learning (CDRL) algorithms in which multiple agents can coordinate over a wireless network is a promising approach to enable future intelligent and autonomous systems that rely on real-time decision-making in complex dynamic environments. Nonetheless, in practical scenarios, CDRL faces many challenges due to the heterogeneity of agents and their learning tasks, different environments, time constraints of the learning, and resource limitations of wireless networks. To address these challenges, in this paper, a novel semantic-aware CDRL method is proposed to enable a group of heterogeneous untrained agents with semantically-linked DRL tasks to collaborate efficiently across a resource-constrained wireless cellular network. To this end, a new heterogeneous federated DRL (HFDRL) algorithm is proposed to select the best subset of semantically relevant DRL agents for collaboration. The proposed approach then jointly optimizes the training loss and wireless bandwidth allocation for the cooperating selected agents in order to train each agent within the time limit of its real-time task. Simulation results show the superior performance of the proposed algorithm compared to state-of-the-art baselines.