SY Oct 11, 2011
Optimal Power Allocation for Renewable Energy Source
Abhinav Sinha, Prasanna Chaporkar
Battery-powered transmitters face energy constraints; replenishing their energy from a renewable source (such as solar or wind power) can extend their lifetime. We consider the problem of finding the optimal power allocation under random channel conditions for a wireless transmitter, such that the rate of information transfer is maximized. A rechargeable battery, periodically charged by the renewable source, powers the transmitter. The problem is formulated as a Markov Decision Process. Structural properties derived in this paper, such as the monotonicity of the optimal value and policy, will be of vital importance in understanding the kind of algorithms and approximations needed in real-life scenarios. The effect of the curse of dimensionality, which is prevalent in dynamic programming problems, can thus be reduced. We show our results under the most general of assumptions.
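As a rough illustration of the kind of formulation this abstract describes, the following is a minimal sketch of discounted value iteration for a toy energy-harvesting transmitter. Everything concrete here is an assumption made for illustration and is not taken from the paper: the integer battery levels, the two channel gains, the harvest distribution, the `log2(1 + gain * power)` rate reward, the discount factor, and the function name `value_iteration`.

```python
import math
from itertools import product


def value_iteration(B=5, gains=(0.5, 2.0), gain_probs=(0.5, 0.5),
                    harvest=(0, 1, 2), harvest_probs=(0.3, 0.4, 0.3),
                    beta=0.9, tol=1e-8):
    """Toy model (all parameters are illustrative assumptions).

    State: (battery level b in 0..B, channel index h), channel i.i.d. per slot.
    Action: transmit power p in 0..b. Reward: log2(1 + gains[h] * p).
    Next battery: min(B, b - p + e), with e the random harvested energy.
    """
    H = len(gains)
    V = [[0.0] * H for _ in range(B + 1)]
    while True:
        new_V = [[0.0] * H for _ in range(B + 1)]
        policy = [[0] * H for _ in range(B + 1)]
        for b, h in product(range(B + 1), range(H)):
            best, best_p = -math.inf, 0
            for p in range(b + 1):
                reward = math.log2(1.0 + gains[h] * p)
                # Expected continuation value over harvest and next channel.
                future = 0.0
                for e, pe in zip(harvest, harvest_probs):
                    nb = min(B, b - p + e)
                    for h2, ph in enumerate(gain_probs):
                        future += pe * ph * V[nb][h2]
                q = reward + beta * future
                if q > best:
                    best, best_p = q, p
            new_V[b][h], policy[b][h] = best, best_p
        diff = max(abs(new_V[b][h] - V[b][h])
                   for b in range(B + 1) for h in range(H))
        V = new_V
        if diff < tol:
            return V, policy


V, policy = value_iteration()
for b, row in enumerate(policy):
    print("battery", b, "-> power per channel state", row)
```

In this toy model the value function is nondecreasing in the battery level, since a fuller battery can always imitate the action taken with less energy. The state space grows with the number of battery levels and channel states, which is where structural results of the kind the abstract mentions would help.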
5.4 ET Apr 20
UAVs as Dynamic Nodes in Communication Networks
Riddhi Apte, Shubhada Gadgil, Gaurav Kasbekar et al.
Driven by the demands of 5G/Beyond 5G and 6G networks, Unmanned Aerial Vehicles (UAVs) have taken on critical roles in aerial communications. In this survey, we explore the multi-mode roles of UAVs as relays, User Equipment (UE), gNBs, and Reconfigurable Intelligent Surfaces (RIS), along with their deployment scenarios, architectural frameworks, and different communication models incorporating Artificial Intelligence (AI) configurations. We consider the effects of alternate power sources on the communication payload. The survey also addresses security issues in UAV communications. As an advancement, we propose a novel UAV-Network-in-a-Box (NIB) architecture for disaster recovery and temporary coverage as an alternative to traditional network infrastructure.
ET Dec 3, 2025
AI/ML in 3GPP 5G Advanced - Services and Architecture
Pradnya Taksande, Shwetha Kiran, Pranav Jha et al.
The 3rd Generation Partnership Project (3GPP), the standards body for mobile networks, is in the final phase of Release 19 standardization and is beginning Release 20. Artificial Intelligence/Machine Learning (AI/ML) has brought about a paradigm shift in technology and is being adopted across industries and verticals. 3GPP has been integrating AI/ML into the 5G Advanced system since Release 18. This paper focuses on the AI/ML-related technological advancements and features introduced in Release 19 within the Service and System Aspects (SA) Technical Specification Group of 3GPP. The advancements relate to two paradigms: (i) enhancements that AI/ML brought to the 5G Advanced system (AI for network), e.g., resource optimization, and (ii) enhancements that were made to the 5G system to support AI/ML applications (Network for AI), e.g., image recognition.
NI Sep 14, 2025
Energy-Aware 6G Network Design: A Survey
Rashmi Kamran, Mahesh Ganesh Bhat, Pranav Jha et al.
6th Generation (6G) mobile networks are envisioned to support several new capabilities and data-centric applications for an unprecedented number of users, potentially raising significant energy efficiency and sustainability concerns. This brings focus on sustainability as one of the key objectives in their design. To move towards sustainable solutions, the research and standardization community is focusing on several key issues such as energy information monitoring and exposure, use of renewable energy, and use of Artificial Intelligence/Machine Learning (AI/ML) for improving energy efficiency in 6G networks. The goal is to build energy-aware solutions that take the energy information into account, resulting in energy-efficient networks. The design of energy-aware 6G networks brings new challenges, such as increased overheads in gathering and exposing energy-related information, and the associated user consent management. The aim of this paper is to provide a comprehensive survey of methods used for the design of energy-efficient 6G networks, such as energy harvesting, energy models and parameters, classification of energy-aware services, and AI/ML-based solutions. The survey also includes a few use cases that demonstrate the benefits of incorporating energy awareness into network decisions. Several ongoing standardization efforts in 3GPP, ITU, and IEEE are included to provide insights into the ongoing work and highlight the opportunities for new contributions. We conclude this survey with open research problems and challenges that can be explored to make energy-aware design feasible and ensure optimality regarding performance and energy goals for 6G networks.
LG Sep 2, 2025
Threshold-Based Optimal Arm Selection in Monotonic Bandits: Regret Lower Bounds and Algorithms
Chanakya Varude, Jay Chaudhary, Siddharth Kaushik et al.
In multi-armed bandit problems, the typical goal is to identify the arm with the highest reward. This paper explores a threshold-based bandit problem, aiming to select an arm based on its relation to a prescribed threshold \(τ\). We study variants where the optimal arm is the first above \(τ\), the \(k^{th}\) arm above or below it, or the closest to it, under a monotonic structure of arm means. We derive asymptotic regret lower bounds, showing dependence only on arms adjacent to \(τ\). Motivated by applications in communication networks (CQI allocation), clinical dosing, energy management, recommendation systems, and more, we propose algorithms with optimality validated through Monte Carlo simulations. Our work extends classical bandit theory with threshold constraints for efficient decision-making.
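To make two of the target definitions concrete, here is a small sketch that locates the "first arm above the threshold" and the "arm closest to the threshold" when the arm means are already known and sorted in nondecreasing order. This only illustrates what the optimal arm is, not the learning algorithms from the paper; the function names, the strict "above" convention, and the example means are assumptions.

```python
from bisect import bisect_right


def first_arm_above(means, tau):
    """Index of the first arm whose mean is strictly above tau, or None.

    Assumes `means` is sorted in nondecreasing order (monotonic structure).
    """
    i = bisect_right(means, tau)
    return i if i < len(means) else None


def closest_arm(means, tau):
    """Index of the arm whose mean is closest to tau (first one on ties)."""
    return min(range(len(means)), key=lambda i: abs(means[i] - tau))


means = [0.1, 0.3, 0.45, 0.6, 0.9]  # hypothetical arm means
tau = 0.5
print(first_arm_above(means, tau))  # arm with mean 0.6
print(closest_arm(means, tau))      # arm with mean 0.45
```

In the bandit setting the means are unknown, so a learner has to estimate where the threshold crossing lies. Under the monotonic structure, only the arms next to the crossing are hard to tell apart, which matches the abstract's remark that the lower bounds depend only on arms adjacent to the threshold.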
LG Aug 4, 2025
Clus-UCB: A Near-Optimal Algorithm for Clustered Bandits
Aakash Gore, Prasanna Chaporkar
We study a stochastic multi-armed bandit setting where arms are partitioned into known clusters, such that the mean rewards of arms within a cluster differ by at most a known threshold. While the clustering structure is known a priori, the arm means are unknown. We derive an asymptotic lower bound on the regret that improves upon the classical bound of Lai & Robbins (1985). We then propose Clus-UCB, an efficient algorithm that closely matches this lower bound asymptotically. Clus-UCB is designed to exploit the clustering structure and introduces a new index to evaluate an arm, which depends on other arms within the cluster. In this way, arms share information with each other. We present simulation results of our algorithm and compare its performance against KL-UCB and other well-known algorithms for bandits with dependent arms. Finally, we address some limitations of this work and conclude by mentioning some possible future research.
LG Dec 21, 2019
Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes
Arghyadip Roy, Vivek Borkar, Abhay Karandikar et al.
To overcome the curses of dimensionality and modeling that affect Dynamic Programming (DP) methods for solving Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice. In contrast to traditional RL algorithms, which do not consider the structural properties of the optimal policy, we propose a structure-aware learning algorithm that exploits the ordered multi-threshold structure of the optimal policy, if any. We prove the asymptotic convergence of the proposed algorithm to the optimal policy. Due to the reduction in the policy space, the proposed algorithm provides remarkable improvements in storage and computational complexities over classical RL algorithms. Simulation results establish that the proposed algorithm converges faster than other RL algorithms.
SP May 22, 2019
MIST: A Novel Training Strategy for Low-latency Scalable Neural Net Decoders
Kumar Yashashwi, Deepak Anand, Sibi Raj B Pillai et al.
In this paper, we propose a low-latency, robust, and scalable neural net based decoder for convolutional and low-density parity-check (LDPC) coding schemes. The proposed decoders are demonstrated to have bit error rate (BER) and block error rate (BLER) performance on par with state-of-the-art neural net based decoders while achieving more than 8 times higher decoding speed. The enhanced decoding speed is due to the use of a convolutional neural network (CNN) as opposed to the recurrent neural network (RNN) used in the best known neural net based decoders. This contradicts the existing doctrine that only RNN-based decoders can provide performance close to the optimal ones. The key ingredient of our approach is a novel Mixed-SNR Independent Samples based Training (MIST), which allows training a CNN with only 1% of possible datawords, even for block lengths as high as 1000. The proposed decoder is robust in that, once trained, the same decoder can be used for a wide range of SNR values. Finally, in the presence of channel outages, the proposed decoders outperform the best known decoders, viz. the unquantized Viterbi decoder for convolutional codes and belief propagation for LDPC. This gives the CNN decoder a significant advantage in 5G millimeter wave systems, where channel outages are prevalent.
LG Feb 7, 2019
KLUCB Approach to Copeland Bandits
Nischal Agrawal, Prasanna Chaporkar
The multi-armed bandit (MAB) problem is a reinforcement learning framework where an agent tries to maximise her profit by proper selection of actions through absolute feedback for each action. The dueling bandits problem is a variation of the MAB problem in which an agent chooses a pair of actions and receives relative feedback for the chosen action pair. The dueling bandits problem is well suited for modelling a setting in which it is not possible to provide quantitative feedback for each action, but qualitative feedback for each action is preferred, as in the case of human feedback. Dueling bandits have been successfully applied in applications such as online rank elicitation, information retrieval, search engine improvement, and clinical online recommendation. We propose a new method called Sup-KLUCB for the K-armed dueling bandit problem, specifically the Copeland bandit problem, by converting it into a standard MAB problem. Instead of using a MAB algorithm independently for each action in a pair, as in the Sparring and Self-Sparring algorithms, we combine a pair of actions and use it as one action. Previous UCB algorithms such as Relative Upper Confidence Bound (RUCB) can be applied only in the case of Condorcet dueling bandits, whereas this algorithm applies to general Copeland dueling bandits, including Condorcet dueling bandits as a special case. In our empirical results, the method outperforms the state-of-the-art Double Thompson Sampling (DTS) in the case of Copeland dueling bandits.
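The distinction between Condorcet and Copeland winners can be illustrated with a short sketch that uses the standard definitions: the Copeland score of an arm is the number of other arms it beats with probability above 1/2, a Copeland winner maximizes that score, and a Condorcet winner beats every other arm. This is not the Sup-KLUCB algorithm; the preference matrix and function names below are hypothetical, and the matrix is assumed to be known rather than learned.

```python
def copeland_scores(pref):
    """pref[i][j] is the probability that arm i beats arm j in a duel."""
    k = len(pref)
    return [sum(1 for j in range(k) if j != i and pref[i][j] > 0.5)
            for i in range(k)]


def copeland_winners(pref):
    """All arms with the maximum Copeland score (at least one always exists)."""
    scores = copeland_scores(pref)
    best = max(scores)
    return [i for i, s in enumerate(scores) if s == best]


def condorcet_winner(pref):
    """The arm that beats every other arm, or None if there is no such arm."""
    k = len(pref)
    for i, s in enumerate(copeland_scores(pref)):
        if s == k - 1:
            return i
    return None


# Hypothetical 4-arm preference matrix with a cycle (0 > 1 > 3 > 0).
pref = [
    [0.5, 0.6, 0.7, 0.4],
    [0.4, 0.5, 0.6, 0.8],
    [0.3, 0.4, 0.5, 0.9],
    [0.6, 0.2, 0.1, 0.5],
]
print(copeland_scores(pref))
print(copeland_winners(pref))
print(condorcet_winner(pref))
```

In this example no arm beats all others, so there is no Condorcet winner, yet Copeland winners still exist. When a Condorcet winner does exist, it is the unique Copeland winner, which is why the Condorcet setting is a special case.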
LG Nov 28, 2018
A Structure-aware Online Learning Algorithm for Markov Decision Processes
Arghyadip Roy, Vivek Borkar, Abhay Karandikar et al.
To overcome the curse of dimensionality and the curse of modeling in Dynamic Programming (DP) methods for solving classical Markov Decision Process (MDP) problems, Reinforcement Learning (RL) algorithms are popular. In this paper, we consider an infinite-horizon average reward MDP problem and prove the optimality of the threshold policy under certain conditions. Traditional RL techniques do not exploit the threshold nature of the optimal policy while learning. We propose a new RL algorithm that utilizes the known threshold structure of the optimal policy while learning by reducing the feasible policy space. We establish that the proposed algorithm converges to the optimal policy. It provides a significant improvement in convergence speed and in computational and storage complexity over traditional RL algorithms. The proposed technique can be applied to a wide variety of optimization problems, including energy-efficient data transmission and management of queues. We exhibit the improvement in convergence speed of the proposed algorithm over other RL algorithms through simulations.
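The reduction of the policy space mentioned above can be seen in a small counting sketch. It assumes a chain of ordered states with two actions and the convention that action 1 is taken at or above the threshold; these choices and the function names are illustrative and not taken from the paper.

```python
def threshold_policy(threshold, num_states):
    """Action 1 in states >= threshold, action 0 below (assumed convention)."""
    return [1 if s >= threshold else 0 for s in range(num_states)]


def all_threshold_policies(num_states):
    """One policy per threshold value in 0..num_states."""
    return [threshold_policy(t, num_states) for t in range(num_states + 1)]


num_states = 10
candidates = all_threshold_policies(num_states)
print(len(candidates), "threshold policies vs",
      2 ** num_states, "deterministic binary policies")
```

With a single scalar threshold to learn instead of one action per state, a learner searches over a number of candidates that is linear in the number of states rather than exponential.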