NIJan 14, 2023
Multi-armed Bandit Learning for TDMA Transmission Slot Scheduling and Defragmentation for Improved Bandwidth UsageHrishikesh Dutta, Amit Kumar Bhuyan, Subir Biswas
This paper proposes a Time Division Multiple Access (TDMA) MAC slot allocation protocol with efficient bandwidth usage in wireless sensor networks and Internet of Things (IoTs). The developed protocol has two primary components: a Multi-Armed Bandits (MAB)-based slot allocation mechanism for collision free transmission, and a Decentralized Defragmented Slot Backshift (DDSB) operation for improving bandwidth usage efficiency. The proposed framework is decentralized in that each node finds its transmission schedule independently without the control of any centralized arbitrator. The developed mechanism is suitable for networks with or without time synchronization, thus, making it suitable for low-complexity wireless transceivers for wireless sensor and IoT nodes. This framework is able to manage the trade-off between learning convergence time and bandwidth. In addition, it allows the nodes to adapt to topological changes while maintaining efficient bandwidth usage. The developed logic is tested for both fully-connected and arbitrary mesh networks with extensive simulation experiments. It is shown how the nodes can learn to select collision-free transmission slots using MAB. Moreover, the nodes learn to self-adjust their transmission schedules using a novel DDSB framework in order to reduce bandwidth usage.
NIJan 14, 2023
Reinforcement Learning for Protocol Synthesis in Resource-Constrained Wireless Sensor and IoT NetworksHrishikesh Dutta, Amit Kumar Bhuyan, Subir Biswas
This article explores the concepts of online protocol synthesis using Reinforcement Learning (RL). The study is performed in the context of sensor and IoT networks with ultra low complexity wireless transceivers. The paper introduces the use of RL and Multi Armed Bandit (MAB), a specific type of RL, for Medium Access Control (MAC) under different network and traffic conditions. It then introduces a novel learning based protocol synthesis framework that addresses specific difficulties and limitations in medium access for both random access and time slotted networks. The mechanism does not rely on carrier sensing, network time-synchronization, collision detection, and other low level complex operations, thus making it ideal for ultra simple transceiver hardware used in resource constrained sensor and IoT networks. Additionally, the ability of independent protocol learning by the nodes makes the system robust and adaptive to the changes in network and traffic conditions. It is shown that the nodes can be trained to learn to avoid collisions, and to achieve network throughputs that are comparable to ALOHA based access protocols in sensor and IoT networks with simplest transceiver hardware. It is also shown that using RL, it is feasible to synthesize access protocols that can sustain network throughput at high traffic loads, which is not feasible in the ALOHA-based systems. The ability of the system to provide throughput fairness under network and traffic heterogeneities are also experimentally demonstrated.
SDApr 16, 2024
Unsupervised Speaker Diarization in Distributed IoT Networks Using Federated LearningAmit Kumar Bhuyan, Hrishikesh Dutta, Subir Biswas
This paper presents a computationally efficient and distributed speaker diarization framework for networked IoT-style audio devices. The work proposes a Federated Learning model which can identify the participants in a conversation without the requirement of a large audio database for training. An unsupervised online update mechanism is proposed for the Federated Learning model which depends on cosine similarity of speaker embeddings. Moreover, the proposed diarization system solves the problem of speaker change detection via. unsupervised segmentation techniques using Hotelling's t-squared Statistic and Bayesian Information Criterion. In this new approach, speaker change detection is biased around detected quasi-silences, which reduces the severity of the trade-off between the missed detection and false detection rates. Additionally, the computational overhead due to frame-by-frame identification of speakers is reduced via. unsupervised clustering of speech segments. The results demonstrate the effectiveness of the proposed training method in the presence of non-IID speech data. It also shows a considerable improvement in the reduction of false and missed detection at the segmentation stage, while reducing the computational overhead. Improved accuracy and reduced computational cost makes the mechanism suitable for real-time speaker diarization across a distributed IoT audio network.
LGJan 15, 2025
Towards Federated Multi-Armed Bandit Learning for Content Dissemination using Swarm of UAVsAmit Kumar Bhuyan, Hrishikesh Dutta, Subir Biswas
This paper introduces an Unmanned Aerial Vehicle - enabled content management architecture that is suitable for critical content access in communities of users that are communication-isolated during diverse types of disaster scenarios. The proposed architecture leverages a hybrid network of stationary anchor UAVs and mobile Micro-UAVs for ubiquitous content dissemination. The anchor UAVs are equipped with both vertical and lateral communication links, and they serve local users, while the mobile micro-ferrying UAVs extend coverage across communities with increased mobility. The focus is on developing a content dissemination system that dynamically learns optimal caching policies to maximize content availability. The core innovation is an adaptive content dissemination framework based on distributed Federated Multi-Armed Bandit learning. The goal is to optimize UAV content caching decisions based on geo-temporal content popularity and user demand variations. A Selective Caching Algorithm is also introduced to reduce redundant content replication by incorporating inter-UAV information sharing. This method strategically preserves the uniqueness in user preferences while amalgamating the intelligence across a distributed learning system. This approach improves the learning algorithm's ability to adapt to diverse user preferences. Functional verification and performance evaluation confirm the proposed architecture's utility across different network sizes, UAV swarms, and content popularity patterns.
LGApr 16, 2024
Top-k Multi-Armed Bandit Learning for Content Dissemination in Swarms of Micro-UAVsAmit Kumar Bhuyan, Hrishikesh Dutta, Subir Biswas
This paper presents a Micro-Unmanned Aerial Vehicle (UAV)-enhanced content management system for disaster scenarios where communication infrastructure is generally compromised. Utilizing a hybrid network of stationary and mobile Micro-UAVs, this system aims to provide crucial content access to isolated communities. In the developed architecture, stationary anchor UAVs, equipped with vertical and lateral links, serve users in individual disaster-affected communities. and mobile micro-ferrying UAVs, with enhanced mobility, extend coverage across multiple such communities. The primary goal is to devise a content dissemination system that dynamically learns caching policies to maximize content accessibility to users left without communication infrastructure. The core contribution is an adaptive content dissemination framework that employs a decentralized Top-k Multi-Armed Bandit learning approach for efficient UAV caching decisions. This approach accounts for geo-temporal variations in content popularity and diverse user demands. Additionally, a Selective Caching Algorithm is proposed to minimize redundant content copies by leveraging inter-UAV information sharing. Through functional verification and performance evaluation, the proposed framework demonstrates improved system performance and adaptability across varying network sizes, micro-UAV swarms, and content popularity distributions.
NIDec 18, 2023
Multi-Armed Bandit Learning for Content Provisioning in Network of UAVsAmit Kumar Bhuyan, Hrishikesh Dutta, Subir Biswas
This paper proposes an unmanned aerial vehicle (UAV) aided content management system in communication-challenged disaster scenarios. Without cellular infrastructure in such scenarios, community of stranded users can be provided access to situation-critical contents using a hybrid network of static and traveling UAVs. A set of relatively static anchor UAVs can download content from central servers and provide content access to its local users. A set of ferrying UAVs with wider mobility can provision content to users by shuffling them across different anchor UAVs while visiting different communities of users. The objective is to design a content dissemination system that on-the-fly learns content caching policies for maximizing content availability to the stranded users. This paper proposes a decentralized Top-k Multi-Armed Bandit Learning model for UAV-caching decision-making that takes geo-temporal differences in content popularity and heterogeneity in content demands into consideration. The proposed paradigm is able to combine the expected reward maximization attribute and a proposed multi-dimensional reward structure of Top-k Multi-Armed Bandit, for caching decision at the UAVs. This study is done for different user-specified tolerable access delay, heterogeneous popularity distributions, and inter-community geographical characteristics. Functional verification and performance evaluation of the proposed caching framework is done for a wide range of network size, UAV distribution, and content popularity.
LGApr 29, 2021
Medium Access using Distributed Reinforcement Learning for IoTs with Low-Complexity Wireless TransceiversHrishikesh Dutta, Subir Biswas
This paper proposes a distributed Reinforcement Learning (RL) based framework that can be used for synthesizing MAC layer wireless protocols in IoT networks with low-complexity wireless transceivers. The proposed framework does not rely on complex hardware capabilities such as carrier sensing and its associated algorithmic complexities that are often not supported in wireless transceivers of low-cost and low-energy IoT devices. In this framework, the access protocols are first formulated as Markov Decision Processes (MDP) and then solved using RL. A distributed and multi-Agent RL framework is used as the basis for protocol synthesis. Distributed behavior makes the nodes independently learn optimal transmission strategies without having to rely on full network level information and direct knowledge of behavior of other nodes. The nodes learn to minimize packet collisions such that optimal throughput can be attained and maintained for loading conditions that are higher than what the known benchmark protocols (such as ALOHA) for IoT devices without complex transceivers. In addition, the nodes are observed to be able to learn to act optimally in the presence of heterogeneous loading and network topological conditions. Finally, the proposed learning approach allows the wireless bandwidth to be fairly distributed among network nodes in a way that is not dependent on such heterogeneities. Via simulation experiments, the paper demonstrates the performance of the learning paradigm and its abilities to make nodes adapt their optimal transmission strategies on the fly in response to various network dynamics.
LGFeb 2, 2021
Towards Multi-agent Reinforcement Learning for Wireless Network Protocol SynthesisHrishikesh Dutta, Subir Biswas
This paper proposes a multi-agent reinforcement learning based medium access framework for wireless networks. The access problem is formulated as a Markov Decision Process (MDP), and solved using reinforcement learning with every network node acting as a distributed learning agent. The solution components are developed step by step, starting from a single-node access scenario in which a node agent incrementally learns to control MAC layer packet loads for reining in self-collisions. The strategy is then scaled up for multi-node fully-connected scenarios by using more elaborate reward structures. It also demonstrates preliminary feasibility for more general partially connected topologies. It is shown that by learning to adjust MAC layer transmission probabilities, the protocol is not only able to attain theoretical maximum throughput at an optimal load, but unlike classical approaches, it can also retain that maximum throughput at higher loading conditions. Additionally, the mechanism is agnostic to heterogeneous loading while preserving that feature. It is also shown that access priorities of the protocol across nodes can be parametrically adjusted. Finally, it is also shown that the online learning feature of reinforcement learning is able to make the protocol adapt to time-varying loading conditions.
SPJun 4, 2019
A Natural Language-Inspired Multi-label Video Streaming Traffic Classification Method Based on Deep Neural NetworksYan Shi, Dezhi Feng, Subir Biswas
This paper presents a deep-learning based traffic classification method for identifying multiple streaming video sources at the same time within an encrypted tunnel. The work defines a novel feature inspired by Natural Language Processing (NLP) that allows existing NLP techniques to help the traffic classification. The feature extraction method is described, and a large dataset containing video streaming and web traffic is created to verify its effectiveness. Results are obtained by applying several NLP methods to show that the proposed method performs well on both binary and multilabel traffic classification problems. We also show the ability to achieve zero-shot learning with the proposed method.