LGApr 23
Continuous-Utility Direct Preference OptimizationMuhammad Ahmed Mohsin, Muhammad Umer, Ahsan Bilal et al.
Large language model reasoning is often treated as a monolithic capability, relying on binary preference supervision that fails to capture partial progress or fine-grained reasoning quality. We introduce Continuous Utility Direct Preference Optimization (CU-DPO), a framework that aligns models to a portfolio of prompt-based cognitive strategies by replacing binary labels with continuous scores that capture fine-grained reasoning quality. We prove that learning with K strategies yields a Theta(K log K) improvement in sample complexity over binary preferences, and that DPO converges to the entropy-regularized utility-maximizing policy. To exploit this signal, we propose a two-stage training pipeline: (i) strategy selection, which optimizes the model to choose the best strategy for a given problem via best-vs-all comparisons, and (ii) execution refinement, which trains the model to correctly execute the selected strategy using margin-stratified pairs. On mathematical reasoning benchmarks, CU-DPO improves strategy selection accuracy from 35-46 percent to 68-78 percent across seven base models, yielding consistent downstream reasoning gains of up to 6.6 points on in-distribution datasets with effective transfer to out-of-distribution tasks.
ITApr 13
ISAC-Enabled Non-Terrestrial Networks for 6G: Design Principles, Standardization, Performance Tradeoffs, and Use CasesMuhammad Ali Jamshed, Rohit Singh, Malik Muhammad Saad et al.
Non-Terrestrial Networks (NTN) have emerged as a key enabler to fully realize the vision of integrated, intelligent, and ubiquitous connectivity in 6G systems. However, several operational challenges, including severe Doppler effects, interference, and latency, hinder the seamless integration of NTN and Terrestrial Networks (TN). In this context, Integrated Sensing and Communication (ISAC), which unifies sensing and communication functionalities within a common framework, offers great potential to address these challenges while enabling new network capabilities. Due to its complementary functionalities, ISAC can play a pivotal role in enhancing NTN performance, although its practical adoption requires a fundamental rethinking of existing architectural and standardization frameworks. Motivated by this need, this article examines key aspects of ISAC-enabled NTN, including architectural design principles, application scenarios, standardization challenges, and key performance tradeoffs. Finally, a representative case study is presented to illustrate major technical challenges and highlight promising future research directions for ISAC-enabled NTN.
AIApr 20, 2025
Meta-Thinking in LLMs via Multi-Agent Reinforcement Learning: A SurveyAhsan Bilal, Muhammad Ahmed Mohsin, Muhammad Umer et al.
This survey explores the development of meta-thinking capabilities in Large Language Models (LLMs) from a Multi-Agent Reinforcement Learning (MARL) perspective. Meta-thinking self-reflection, assessment, and control of thinking processes is an important next step in enhancing LLM reliability, flexibility, and performance, particularly for complex or high-stakes tasks. The survey begins by analyzing current LLM limitations, such as hallucinations and the lack of internal self-assessment mechanisms. It then talks about newer methods, including RL from human feedback (RLHF), self-distillation, and chain-of-thought prompting, and each of their limitations. The crux of the survey is to talk about how multi-agent architectures, namely supervisor-agent hierarchies, agent debates, and theory of mind frameworks, can emulate human-like introspective behavior and enhance LLM robustness. By exploring reward mechanisms, self-play, and continuous learning methods in MARL, this survey gives a comprehensive roadmap to building introspective, adaptive, and trustworthy LLMs. Evaluation metrics, datasets, and future research avenues, including neuroscience-inspired architectures and hybrid symbolic reasoning, are also discussed.
IVMar 21, 2025
Vision Transformer Based Semantic Communications for Next Generation Wireless NetworksMuhammad Ahmed Mohsin, Muhammad Jazib, Zeeshan Alam et al.
In the evolving landscape of 6G networks, semantic communications are poised to revolutionize data transmission by prioritizing the transmission of semantic meaning over raw data accuracy. This paper presents a Vision Transformer (ViT)-based semantic communication framework that has been deliberately designed to achieve high semantic similarity during image transmission while simultaneously minimizing the demand for bandwidth. By equipping ViT as the encoder-decoder framework, the proposed architecture can proficiently encode images into a high semantic content at the transmitter and precisely reconstruct the images, considering real-world fading and noise consideration at the receiver. Building on the attention mechanisms inherent to ViTs, our model outperforms Convolution Neural Network (CNNs) and Generative Adversarial Networks (GANs) tailored for generating such images. The architecture based on the proposed ViT network achieves the Peak Signal-to-noise Ratio (PSNR) of 38 dB, which is higher than other Deep Learning (DL) approaches in maintaining semantic similarity across different communication environments. These findings establish our ViT-based approach as a significant breakthrough in semantic communications.
LGJul 13, 2025
Meta-Reinforcement Learning for Fast and Data-Efficient Spectrum Allocation in Dynamic Wireless NetworksOluwaseyi Giwa, Tobi Awodunmila, Muhammad Ahmed Mohsin et al.
The dynamic allocation of spectrum in 5G / 6G networks is critical to efficient resource utilization. However, applying traditional deep reinforcement learning (DRL) is often infeasible due to its immense sample complexity and the safety risks associated with unguided exploration, which can cause severe network interference. To address these challenges, we propose a meta-learning framework that enables agents to learn a robust initial policy and rapidly adapt to new wireless scenarios with minimal data. We implement three meta-learning architectures, model-agnostic meta-learning (MAML), recurrent neural network (RNN), and an attention-enhanced RNN, and evaluate them against a non-meta-learning DRL algorithm, proximal policy optimization (PPO) baseline, in a simulated dynamic integrated access/backhaul (IAB) environment. Our results show a clear performance gap. The attention-based meta-learning agent reaches a peak mean network throughput of 48 Mbps, while the PPO baseline decreased drastically to 10 Mbps. Furthermore, our method reduces SINR and latency violations by more than 50% compared to PPO. It also shows quick adaptation, with a fairness index 0.7, showing better resource allocation. This work proves that meta-learning is a very effective and safer option for intelligent control in complex wireless systems.
LGNov 17, 2025
On the Fundamental Limits of LLMs at ScaleMuhammad Ahmed Mohsin, Muhammad Umer, Ahsan Bilal et al.
Large Language Models (LLMs) have benefited enormously from scaling, yet these gains are bounded by five fundamental limitations: (1) hallucination, (2) context compression, (3) reasoning degradation, (4) retrieval fragility, and (5) multimodal misalignment. While existing surveys describe these phenomena empirically, they lack a rigorous theoretical synthesis connecting them to the foundational limits of computation, information, and learning. This work closes that gap by presenting a unified, proof-informed framework that formalizes the innate theoretical ceilings of LLM scaling. First, computability and uncomputability imply an irreducible residue of error: for any computably enumerable model family, diagonalization guarantees inputs on which some model must fail, and undecidable queries (e.g., halting-style tasks) induce infinite failure sets for all computable predictors. Second, information-theoretic and statistical constraints bound attainable accuracy even on decidable tasks, finite description length enforces compression error, and long-tail factual knowledge requires prohibitive sample complexity. Third, geometric and computational effects compress long contexts far below their nominal size due to positional under-training, encoding attenuation, and softmax crowding. We further show how likelihood-based training favors pattern completion over inference, how retrieval under token limits suffers from semantic drift and coupling noise, and how multimodal scaling inherits shallow cross-modal alignment. Across sections, we pair theorems and empirical evidence to outline where scaling helps, where it saturates, and where it cannot progress, providing both theoretical foundations and practical mitigation paths like bounded-oracle retrieval, positional curricula, and sparse or hierarchical attention.
DCAug 27, 2025
Towards 6G Intelligence: The Role of Generative AI in Future Wireless NetworksMuhammad Ahmed Mohsin, Junaid Ahmad, Muhammad Hamza Nawaz et al.
Ambient intelligence (AmI) is a computing paradigm in which physical environments are embedded with sensing, computation, and communication so they can perceive people and context, decide appropriate actions, and respond autonomously. Realizing AmI at global scale requires sixth generation (6G) wireless networks with capabilities for real time perception, reasoning, and action aligned with human behavior and mobility patterns. We argue that Generative Artificial Intelligence (GenAI) is the creative core of such environments. Unlike traditional AI, GenAI learns data distributions and can generate realistic samples, making it well suited to close key AmI gaps, including generating synthetic sensor and channel data in under observed areas, translating user intent into compact, semantic messages, predicting future network conditions for proactive control, and updating digital twins without compromising privacy. This chapter reviews foundational GenAI models, GANs, VAEs, diffusion models, and generative transformers, and connects them to practical AmI use cases, including spectrum sharing, ultra reliable low latency communication, intelligent security, and context aware digital twins. We also examine how 6G enablers, such as edge and fog computing, IoT device swarms, intelligent reflecting surfaces (IRS), and non terrestrial networks, can host or accelerate distributed GenAI. Finally, we outline open challenges in energy efficient on device training, trustworthy synthetic data, federated generative learning, and AmI specific standardization. We show that GenAI is not a peripheral addition, but a foundational element for transforming 6G from a faster network into an ambient intelligent ecosystem.
QUANT-PHJun 18, 2025
QPPG: Quantum-Preconditioned Policy Gradient for Link Adaptation in Rayleigh Fading ChannelsOluwaseyi Giwa, Muhammad Ahmed Mohsin, Folarin Jubril Adesola et al.
Reliable link adaptation is critical for efficient wireless communications in dynamic fading environments. However, reinforcement learning (RL) solutions often suffer from unstable convergence due to poorly conditioned policy gradients, hindering their practical application. We propose the quantum-preconditioned policy gradient (QPPG) algorithm, which leverages Fisher-information-based preconditioning to stabilise and accelerate policy updates. Evaluations in Rayleigh fading scenarios show that QPPG achieves faster convergence, a 28.6% increase in average throughput, and a 43.8% decrease in average transmit power compared to classical methods. This work introduces quantum-geometric conditioning to link adaptation, marking a significant advance in developing robust, quantum-inspired reinforcement learning for future 6G networks, thereby enhancing communication reliability and energy efficiency.