David Gesbert

h-index: 11

23 papers

847 citations

Novelty: 47%

AI Score: 29

Ranked #150,957 of 201,326 authors (top 75%); #639 in IT (top 74%)

23 Papers

LG · Jul 1, 2022

Robust Bayesian Learning for Reliable Wireless AI: Framework and Applications

Matteo Zecchin, Sangwoo Park, Osvaldo Simeone et al.

This work takes a critical look at the application of conventional machine learning methods to wireless communication problems through the lens of reliability and robustness. Deep learning techniques adopt a frequentist framework, and are known to provide poorly calibrated decisions that do not reproduce the true uncertainty caused by limitations in the size of the training data. Bayesian learning, while in principle capable of addressing this shortcoming, is in practice impaired by model misspecification and by the presence of outliers. Both problems are pervasive in wireless communication settings, in which the capacity of machine learning models is subject to resource constraints and training data is affected by noise and interference. In this context, we explore the application of the framework of robust Bayesian learning. After a tutorial-style introduction to robust Bayesian learning, we showcase the merits of robust Bayesian learning on several important wireless communication problems in terms of accuracy, calibration, and robustness to outliers and misspecification.

IT · May 6, 2022

UAV-aided RF Mapping for Sensing and Connectivity in Wireless Networks

David Gesbert, Omid Esrafilian, Junting Chen et al.

The use of unmanned aerial vehicles (UAV) as flying radio access network (RAN) nodes offers a promising complement to traditional fixed terrestrial deployments. More recently, yet still in the context of wireless networks, drones have also been envisioned for use as radio frequency (RF) sensing and localization devices. In both cases, the advantage of using UAVs lies in their ability to navigate freely in 3D and in a timely manner to locations where the obtained network throughput or sensing performance is optimal. In practice, the selection of a proper location or trajectory for the UAV very much depends on local terrain features, including the position of surrounding radio obstacles. Hence, the robot must be able to map the features of its radio environment as it performs its data communication or sensing services. The challenges related to this task, referred to here as radio mapping, are discussed in this paper. Its promises related to efficient trajectory design for autonomous radio-aware UAVs are highlighted, along with algorithmic solutions. The advantages induced by radio mapping in terms of connectivity, sensing, and localization performance are illustrated.

LG · Jun 3, 2023

Model-aided Federated Reinforcement Learning for Multi-UAV Trajectory Planning in IoT Networks

Jichao Chen, Omid Esrafilian, Harald Bayerlein et al.

Deploying teams of unmanned aerial vehicles (UAVs) to harvest data from distributed Internet of Things (IoT) devices requires efficient trajectory planning and coordination algorithms. Multi-agent reinforcement learning (MARL) has emerged as a solution, but requires extensive and costly real-world training data. To tackle this challenge, we propose a novel model-aided federated MARL algorithm to coordinate multiple UAVs on a data harvesting mission with only limited knowledge about the environment. The proposed algorithm alternates between building an environment simulation model from real-world measurements, specifically learning the radio channel characteristics and estimating unknown IoT device positions, and federated QMIX training in the simulated environment. Each UAV agent trains a local QMIX model in its simulated environment and continuously consolidates it through federated learning with other agents, accelerating the learning process. A performance comparison with standard MARL algorithms demonstrates that our proposed model-aided FedQMIX algorithm reduces the need for real-world training experiences by around three orders of magnitude while attaining similar data collection performance.

LG · May 31, 2022

Communication-Efficient Distributionally Robust Decentralized Learning

Matteo Zecchin, Marios Kountouris, David Gesbert

Decentralized learning algorithms empower interconnected devices to share data and computational resources to collaboratively train a machine learning model without the aid of a central coordinator. In the case of heterogeneous data distributions at the network nodes, collaboration can yield predictors with unsatisfactory performance for a subset of the devices. For this reason, in this work, we consider the formulation of a distributionally robust decentralized learning task and we propose a decentralized single loop gradient descent/ascent algorithm (AD-GDA) to directly solve the underlying minimax optimization problem. We render our algorithm communication-efficient by employing a compressed consensus scheme and we provide convergence guarantees for smooth convex and non-convex loss functions. Finally, we corroborate the theoretical findings with empirical results that highlight AD-GDA's ability to provide unbiased predictors and to greatly improve communication efficiency compared to existing distributionally robust algorithms.

LG · Apr 25, 2023

User-Centric Federated Learning: Trading off Wireless Resources for Personalization

Mohamad Mestoukirdi, Matteo Zecchin, David Gesbert et al.

Statistical heterogeneity across clients in a Federated Learning (FL) system increases the algorithm convergence time and reduces the generalization performance, resulting in a large communication overhead in return for a poor model. To tackle these problems without violating the privacy constraints that FL imposes, personalized FL methods have to couple statistically similar clients without directly accessing their data in order to guarantee a privacy-preserving transfer. In this work, we design user-centric aggregation rules at the parameter server (PS) that are based on readily available gradient information and are capable of producing personalized models for each FL client. The proposed aggregation rules are inspired by an upper bound of the weighted aggregate empirical risk minimizer. We then derive a communication-efficient variant based on user clustering, which greatly enhances its applicability to communication-constrained systems. Our algorithm outperforms popular personalized FL baselines in terms of average accuracy, worst node performance, and training communication overhead.

LG · Mar 3, 2022

Robust PAC$^m$: Training Ensemble Models Under Misspecification and Outliers

Matteo Zecchin, Sangwoo Park, Osvaldo Simeone et al.

Standard Bayesian learning is known to have suboptimal generalization capabilities under misspecification and in the presence of outliers. PAC-Bayes theory demonstrates that the free energy criterion minimized by Bayesian learning is a bound on the generalization error for Gibbs predictors (i.e., for single models drawn at random from the posterior) under the assumption of sampling distributions uncontaminated by outliers. This viewpoint provides a justification for the limitations of Bayesian learning when the model is misspecified, requiring ensembling, and when data is affected by outliers. In recent work, PAC-Bayes bounds -- referred to as PAC$^m$ -- were derived to introduce free energy metrics that account for the performance of ensemble predictors, obtaining enhanced performance under misspecification. This work presents a novel robust free energy criterion that combines the generalized logarithm score function with PAC$^m$ ensemble bounds. The proposed free energy training criterion produces predictive distributions that are able to concurrently counteract the detrimental effects of misspecification -- with respect to both likelihood and prior distribution -- and outliers.

IT · May 6, 2022

UAV-aided Wireless Node Localization Using Hybrid Radio Channel Models

Omid Esrafilian, Rajeev Gangula, David Gesbert

This paper considers the problem of ground user localization based on received signal strength (RSS) measurements obtained by an unmanned aerial vehicle (UAV). We treat UAV-user link channel model parameters and antenna radiation pattern of the UAV as unknowns that need to be estimated. A hybrid channel model is proposed that consists of a traditional path loss model combined with a neural network approximating the UAV antenna gain function. With this model and a set of offline RSS measurements, the unknown parameters are estimated. We then employ the particle swarm optimization (PSO) technique which utilizes the learned hybrid channel model along with a 3D map of the environment to accurately localize the ground users. The performance of the developed algorithm is evaluated through simulations and also real-world experiments.
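
The hybrid channel model described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard log-distance path loss form, and the names `ref_power_db`, `path_loss_exponent`, and `antenna_gain_db` are hypothetical. In the paper the antenna gain is approximated by a neural network; here it is any callable, and the exact model form used by the authors is not given in the abstract.

```python
import math


def path_loss_rss_db(distance_m, ref_power_db, path_loss_exponent):
    """Log-distance path loss: RSS in dB at distance_m (must be > 0).

    ref_power_db is the assumed RSS at a 1 m reference distance.
    """
    return ref_power_db - 10.0 * path_loss_exponent * math.log10(distance_m)


def hybrid_rss_db(distance_m, elevation_rad, ref_power_db,
                  path_loss_exponent, antenna_gain_db):
    """Path loss term plus a learned, direction-dependent antenna gain.

    antenna_gain_db is a hypothetical callable standing in for the
    neural network that approximates the UAV antenna gain function.
    """
    base = path_loss_rss_db(distance_m, ref_power_db, path_loss_exponent)
    return base + antenna_gain_db(elevation_rad)


if __name__ == "__main__":
    # Placeholder gain: a constant 3 dB regardless of angle.
    print(hybrid_rss_db(10.0, 0.5, -30.0, 2.0, lambda angle: 3.0))
```

Keeping the gain as a separate callable mirrors the split in the abstract between a parametric path loss part and a learned antenna part, so either piece can be fitted or replaced independently.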

IT · Jun 4, 2022

UAV-Aided Multi-Community Federated Learning

Mohamad Mestoukirdi, Omid Esrafilian, David Gesbert et al.

In this work, we investigate the problem of online trajectory design for an Unmanned Aerial Vehicle (UAV) in a Federated Learning (FL) setting where several different communities exist, each defined by a unique task to be learned. In this setting, spatially distributed devices belonging to each community collaboratively contribute towards training their community model via wireless links provided by the UAV. Accordingly, the UAV acts as a mobile orchestrator coordinating the transmissions and the learning schedule among the devices in each community, with the aim of accelerating the learning process of all tasks. We propose a heuristic metric as a proxy for the training performance of the different tasks. Capitalizing on this metric, a surrogate objective is defined which enables us to jointly optimize the UAV trajectory and the scheduling of the devices by employing convex optimization techniques and graph theory. Simulations show that our solution outperforms other handpicked static and mobile UAV deployment baselines.

IT · Mar 2, 2022

UAV-Aided Decentralized Learning over Mesh Networks

Matteo Zecchin, David Gesbert, Marios Kountouris

Decentralized learning empowers wireless network devices to collaboratively train a machine learning (ML) model relying solely on device-to-device (D2D) communication. It is known that the convergence speed of decentralized optimization algorithms depends strongly on the degree of network connectivity, with denser network topologies leading to shorter convergence time. Consequently, the local connectivity of real-world mesh networks, due to the limited communication range of their wireless nodes, undermines the efficiency of decentralized learning protocols, rendering them potentially impracticable. In this work, we investigate the role of an unmanned aerial vehicle (UAV), used as a flying relay, in facilitating decentralized learning procedures in such challenging conditions. We propose an optimized UAV trajectory, defined as a sequence of waypoints that the UAV visits sequentially in order to transfer intelligence across sparsely connected groups of users. We then provide a series of experiments highlighting the essential role of UAVs in the context of decentralized learning over mesh networks.

LG · Sep 19, 2023

Communication-Efficient Federated Learning via Regularized Sparse Random Networks

Mohamad Mestoukirdi, Omid Esrafilian, David Gesbert et al.

This work presents a new method for enhancing communication efficiency in stochastic Federated Learning that trains over-parameterized random networks. In this setting, a binary mask is optimized instead of the model weights, which are kept fixed. The mask characterizes a sparse sub-network that is able to generalize as well as a smaller target network. Importantly, sparse binary masks are exchanged rather than the floating-point weights of traditional federated learning, reducing communication cost to at most 1 bit per parameter (Bpp). We show that previous state-of-the-art stochastic methods fail to find sparse networks that can reduce the communication and storage overhead using consistent loss objectives. To address this, we propose adding a regularization term to local objectives that acts as a proxy for the entropy of the transmitted masks, thereby encouraging sparser solutions by eliminating redundant features across sub-networks. Extensive empirical experiments demonstrate significant improvements in communication and memory efficiency of up to five orders of magnitude compared to the literature, with minimal performance degradation in validation accuracy in some instances.
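
The 1 Bpp ceiling and the link between mask entropy and sparsity can be illustrated with a small calculation. The sketch below is not the paper's regularizer: it assumes mask entries are i.i.d. Bernoulli, so that the empirical entropy gives an idealized estimate of the coded size, and the function names are hypothetical.

```python
import math


def bernoulli_entropy_bits(p):
    """Entropy in bits of a Bernoulli(p) variable; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def mask_cost_bits(mask):
    """Idealized coded size of a 0/1 mask, assuming i.i.d. entries.

    The cost never exceeds len(mask) bits, i.e. 1 bit per parameter,
    and shrinks as the mask becomes sparser (or denser) than 50%.
    """
    if not mask:
        return 0.0
    p = sum(mask) / len(mask)
    return len(mask) * bernoulli_entropy_bits(p)


if __name__ == "__main__":
    balanced = [1, 0, 1, 0]
    sparse = [1] + [0] * 9
    print(mask_cost_bits(balanced), mask_cost_bits(sparse))
```

A half-ones mask costs the full 1 bit per parameter, while a sparse mask costs less, which is the intuition behind penalizing mask entropy to reduce communication.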

IT · May 7, 2024

Global Scale Self-Supervised Channel Charting with Sensor Fusion

Omid Esrafilian, Mohsen Ahadi, Florian Kaltenberger et al.

The sensing and positioning capabilities foreseen in 6G have great potential for technology advancements in various domains, such as future smart cities and industrial use cases. Channel charting has emerged as a promising technology in recent years for radio frequency-based sensing and localization. However, the accuracy of these techniques is still far behind the numbers envisioned in 6G. To reduce this gap, in this paper, we propose a novel channel charting technique that capitalizes on the time of arrival measurements from surrounding Transmission Reception Points (TRPs) along with their locations, and that leverages sensor fusion by incorporating laser scanner data during the training phase of our algorithm. The proposed algorithm remains self-supervised during training and test phases, requiring no geometrical models or user position ground truth. Simulation results show that our algorithm achieves sub-meter localization accuracy 90% of the time, outperforming state-of-the-art channel charting techniques and traditional triangulation-based approaches.

SP · May 12, 2023

Revisiting Matching Pursuit: Beyond Approximate Submodularity

Ehsan Tohidi, Mario Coutino, David Gesbert

We study the problem of selecting a subset of vectors from a large set, to obtain the best signal representation over a family of functions. Although greedy methods have been widely used for tackling this problem and many of those have been analyzed under the lens of (weak) submodularity, none of these algorithms are explicitly devised using such a functional property. Here, we revisit the vector-selection problem and introduce a function which is shown to be submodular in expectation. This function not only guarantees near-optimality through a greedy algorithm in expectation, but also alleviates the existing deficiencies in commonly used matching pursuit (MP) algorithms. We further show the relation between the single-point-estimate version of the proposed greedy algorithm and MP variants. Our theoretical results are supported by numerical experiments for the angle of arrival estimation problem, a typical signal representation task; the experiments demonstrate the benefits of the proposed method with respect to the traditional MP algorithms.
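
For readers unfamiliar with the baseline, the classic matching pursuit loop that this abstract refers to can be sketched as below. This is the textbook greedy MP, not the submodular-in-expectation method the paper proposes; it assumes the dictionary atoms have unit norm, and the helper names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter):
    """Classic MP: repeatedly pick the atom most correlated with the residual.

    atoms is a list of unit-norm vectors (an assumption of this sketch).
    Returns the coefficient per atom and the final residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        corr = [dot(residual, atom) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        coeffs[k] += corr[k]
        # Remove the selected atom's contribution from the residual.
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


if __name__ == "__main__":
    atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(matching_pursuit([3.0, 0.0, -2.0], atoms, 2))
```

Each iteration selects one atom greedily by correlation magnitude, which is the selection rule whose deficiencies the abstract says the proposed function alleviates.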

LG · Oct 19, 2021

User-Centric Federated Learning

Mohamad Mestoukirdi, Matteo Zecchin, David Gesbert et al.

Data heterogeneity across participating devices poses one of the main challenges in federated learning as it has been shown to greatly hamper its convergence time and generalization capabilities. In this work, we address this limitation by enabling personalization using multiple user-centric aggregation rules at the parameter server. Our approach potentially produces a personalized model for each user at the cost of some extra downlink communication overhead. To strike a trade-off between personalization and communication efficiency, we propose a broadcast protocol that limits the number of personalized streams while retaining the essential advantages of our learning scheme. Through simulation results, our approach is shown to enjoy higher personalization capabilities, faster convergence, and better communication efficiency compared to other competing baseline solutions.

RO · Sep 30, 2021

Modeling Interactions of Autonomous Vehicles and Pedestrians with Deep Multi-Agent Reinforcement Learning for Collision Avoidance

Raphael Trumpp, Harald Bayerlein, David Gesbert

Reliable pedestrian crash avoidance mitigation (PCAM) systems are crucial components of safe autonomous vehicles (AVs). The nature of the vehicle-pedestrian interaction where decisions of one agent directly affect the other agent's optimal behavior, and vice versa, is a challenging yet often neglected aspect of such systems. We address this issue by modeling a Markov decision process (MDP) for a simulated AV-pedestrian interaction at an unmarked crosswalk. The AV's PCAM decision policy is learned through deep reinforcement learning (DRL). Since modeling pedestrians realistically is challenging, we compare two levels of intelligent pedestrian behavior. While the baseline model follows a predefined strategy, our advanced pedestrian model is defined as a second DRL agent. This model captures continuous learning and the uncertainty inherent in human behavior, making the AV-pedestrian interaction a deep multi-agent reinforcement learning (DMARL) problem. We benchmark the developed PCAM systems according to the collision rate and the resulting traffic flow efficiency with a focus on the influence of observation uncertainty on the decision-making of the agents. The results show that the AV is able to completely mitigate collisions under the majority of the investigated conditions and that the DRL pedestrian model learns an intelligent crossing behavior.

HC · Aug 2, 2021

Leveraging Multiple Legacy Wi-Fi Links for Human Behavior Sensing

Lingchao Guo, Zhaoming Lu, Xiangming Wen et al.

Taking advantage of the rich information provided by Wi-Fi measurement setups, Wi-Fi-based human behavior sensing leveraging Channel State Information (CSI) measurements has received a lot of research attention in recent years. The CSI-based human sensing algorithms typically either rely on an explicit channel propagation model or, more recently, adopt machine learning so as to robustify feature extraction. In most related work, the considered CSI is extracted from a single dedicated Access Point (AP) communication setup. In this paper, we consider a more realistic setting where a legacy network of multiple APs is already deployed for communications purposes and leveraged for sensing benefits using machine learning. The use of a legacy network presents challenges and opportunities, as the many Wi-Fi links can provide richer yet unequally useful data sets. In order to break the curse of dimensionality associated with training over CSI of excessively large dimension, we propose a link selection mechanism based on Reinforcement Learning (RL) which allows for dimension reduction while preserving the data that is most relevant for human behavior sensing. The method is based on a sequential state decision-making process in which the CSI is modeled as a part of the state. In real experiments, our method is shown to perform better than state-of-the-art approaches in a scenario with multiple available Wi-Fi links.

IT · Apr 29, 2021

LIDAR and Position-Aided mmWave Beam Selection with Non-local CNNs and Curriculum Training

Matteo Zecchin, Mahdi Boloursaz Mashhadi, Mikolaj Jankowski et al.

Efficient millimeter wave (mmWave) beam selection in vehicle-to-infrastructure (V2I) communication is a crucial yet challenging task due to the narrow mmWave beamwidth and high user mobility. To reduce the search overhead of iterative beam discovery procedures, contextual information from light detection and ranging (LIDAR) sensors mounted on vehicles has been leveraged by data-driven methods to produce useful side information. In this paper, we propose a lightweight neural network (NN) architecture along with the corresponding LIDAR preprocessing, which significantly outperforms previous works. Our solution comprises multiple novelties that improve both the convergence speed and the final accuracy of the model. In particular, we define a novel loss function inspired by the knowledge distillation idea, introduce a curriculum training approach exploiting line-of-sight (LOS)/non-line-of-sight (NLOS) information, and we propose a non-local attention module to improve the performance for the more challenging NLOS cases. Simulation results on benchmark datasets show that, utilizing solely LIDAR data and the receiver position, our NN-based beam selection scheme can achieve 79.9% throughput of an exhaustive beam sweeping approach without any beam search overhead and 95% by searching among as few as 6 beams. In a typical mmWave V2I scenario, our proposed method considerably reduces the beam search time required to achieve a desired throughput, in comparison with the inverse fingerprinting and hierarchical beam selection schemes.

IT · Apr 21, 2021

Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks

Omid Esrafilian, Harald Bayerlein, David Gesbert

Deep Reinforcement Learning (DRL) is gaining attention as a potential approach to design trajectories for autonomous unmanned aerial vehicles (UAV) used as flying access points in the context of cellular or Internet of Things (IoT) connectivity. DRL solutions offer the advantage of on-the-go learning, hence relying on very little prior contextual information. A corresponding drawback, however, lies in the need for many learning episodes, which severely restricts the applicability of such an approach in real-world time- and energy-constrained missions. Here, we propose a model-aided deep Q-learning approach that, in contrast to previous work, considerably reduces the need for extensive training data samples, while still achieving the overarching goal of DRL, i.e., to guide a battery-limited UAV on an efficient data harvesting trajectory, without prior knowledge of wireless channel characteristics and with limited knowledge of wireless node locations. The key idea consists in using a small subset of nodes as anchors (i.e., with known locations) and learning a model of the propagation environment while implicitly estimating the positions of regular nodes. Interaction with the model allows us to train a deep Q-network (DQN) to approximate the optimal UAV control policy. We show that in comparison with standard DRL approaches, the proposed model-aided approach requires at least one order of magnitude fewer training data samples to reach identical data collection performance, hence offering a first step towards making DRL a viable solution to the problem.

MA · Oct 23, 2020

Multi-UAV Path Planning for Wireless Data Harvesting with Deep Reinforcement Learning

Harald Bayerlein, Mirco Theile, Marco Caccamo et al.

Harvesting data from distributed Internet of Things (IoT) devices with multiple autonomous unmanned aerial vehicles (UAVs) is a challenging problem requiring flexible path planning methods. We propose a multi-agent reinforcement learning (MARL) approach that, in contrast to previous work, can adapt to profound changes in the scenario parameters defining the data harvesting mission, such as the number of deployed UAVs, number, position and data amount of IoT devices, or the maximum flying time, without the need to perform expensive recomputations or relearn control policies. We formulate the path planning problem for a cooperative, non-communicating, and homogeneous team of UAVs tasked with maximizing collected data from distributed IoT sensor nodes subject to flying time and collision avoidance constraints. The path planning problem is translated into a decentralized partially observable Markov decision process (Dec-POMDP), which we solve through a deep reinforcement learning (DRL) approach, approximating the optimal UAV control policy without prior knowledge of the challenging wireless channel characteristics in dense urban environments. By exploiting a combination of centered global and local map representations of the environment that are fed into convolutional layers of the agents, we show that our proposed network architecture enables the agents to cooperate effectively by carefully dividing the data collection task among themselves, adapt to large complex environments and state spaces, and make movement decisions that balance data collection goals, flight-time efficiency, and navigation constraints. Finally, learning a control policy that generalizes over the scenario parameter space enables us to analyze the influence of individual parameters on collection performance and provide some intuition about system-level benefits.

RO · Oct 14, 2020

UAV Path Planning using Global and Local Map Information with Deep Reinforcement Learning

Mirco Theile, Harald Bayerlein, Richard Nai et al.

Path planning methods for autonomous unmanned aerial vehicles (UAVs) are typically designed for one specific type of mission. This work presents a method for autonomous UAV path planning based on deep reinforcement learning (DRL) that can be applied to a wide range of mission scenarios. Specifically, we compare coverage path planning (CPP), where the UAV's goal is to survey an area of interest, to data harvesting (DH), where the UAV collects data from distributed Internet of Things (IoT) sensor devices. By exploiting structured map information of the environment, we train double deep Q-networks (DDQNs) with identical architectures on both of these distinctly different mission scenarios to make movement decisions that balance the respective mission goal with navigation constraints. By introducing a novel approach exploiting a compressed global map of the environment combined with a cropped but uncompressed local map showing the vicinity of the UAV agent, we demonstrate that the proposed method can efficiently scale to large environments. We also extend previous results for generalizing control policies that require no retraining when scenario parameters change and offer a detailed analysis of crucial map processing parameters' effects on path planning performance.
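
Several of the UAV path planning papers listed here train double deep Q-networks. As a reminder of what distinguishes a DDQN from a plain DQN, the standard double Q-learning target can be sketched as follows. This is the generic textbook rule, not code from the papers; the Q-values are passed in as plain lists and the function name is hypothetical.

```python
def ddqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double Q-learning target for one transition.

    The online network selects the next action, and the target network
    evaluates it. q_online_next and q_target_next hold the Q-values of
    the next state, one entry per action.
    """
    if done:
        # No bootstrapping beyond a terminal state.
        return reward
    best_action = max(range(len(q_online_next)),
                      key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # The online network prefers action 1; the target network scores it 2.0.
    print(ddqn_target(1.0, False, 0.5, [0.1, 0.9], [4.0, 2.0]))
```

Decoupling action selection from action evaluation is what reduces the overestimation bias of taking a maximum over a single network's Q-values.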

IT · Jul 28, 2020

Team Deep Mixture of Experts for Distributed Power Control

Matteo Zecchin, David Gesbert, Marios Kountouris

In the context of wireless networking, it was recently shown that multiple DNNs can be jointly trained to offer a desired collaborative behaviour capable of coping with a broad range of sensing uncertainties. In particular, it was established that DNNs can be used to derive policies that are robust with respect to the information noise statistic affecting the local information (e.g., CSI in a wireless network) used by each agent (e.g., transmitter) to make its decision. While promising, a major challenge in the implementation of such a method is that information noise statistics may differ from agent to agent and, more importantly, that such statistics may not be available at the time of training or may evolve over time, making burdensome retraining necessary. This situation makes it desirable to devise a "universal" machine learning model, which can be trained once and for all so as to allow for decentralized cooperation in any future feedback noise environment. With this goal in mind, we propose an architecture inspired by the well-known Mixture of Experts (MoE) model, which was previously used for non-linear regression and classification tasks in various contexts, such as computer vision and speech recognition. We consider the decentralized power control problem as an example to showcase the validity of the proposed model and to compare it against other power control algorithms. We show the ability of the so-called Team-DMoE model to efficiently track time-varying statistical scenarios.

LG · Jul 1, 2020

UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement Learning Approach

Harald Bayerlein, Mirco Theile, Marco Caccamo et al.

Autonomous deployment of unmanned aerial vehicles (UAVs) supporting next-generation communication networks requires efficient trajectory planning methods. We propose a new end-to-end reinforcement learning (RL) approach to UAV-enabled data collection from Internet of Things (IoT) devices in an urban environment. An autonomous drone is tasked with gathering data from distributed sensor nodes subject to limited flying time and obstacle avoidance. While previous approaches, learning and non-learning based, must perform expensive recomputations or relearn a behavior when important scenario parameters such as the number of sensors, sensor positions, or maximum flying time, change, we train a double deep Q-network (DDQN) with combined experience replay to learn a UAV control policy that generalizes over changing scenario parameters. By exploiting a multi-layer map of the environment fed through convolutional network layers to the agent, we show that our proposed network architecture enables the agent to make movement decisions for a variety of scenario parameters that balance the data collection goal with flight time efficiency and safety constraints. Considerable advantages in learning efficiency from using a map centered on the UAV's position over a non-centered map are also illustrated.

RO · Mar 5, 2020

UAV Coverage Path Planning under Varying Power Constraints using Deep Reinforcement Learning

Mirco Theile, Harald Bayerlein, Richard Nai et al.

Coverage path planning (CPP) is the task of designing a trajectory that enables a mobile agent to travel over every point of an area of interest. We propose a new method to control an unmanned aerial vehicle (UAV) carrying a camera on a CPP mission with random start positions and multiple options for landing positions in an environment containing no-fly zones. While numerous approaches have been proposed to solve similar CPP problems, we leverage end-to-end reinforcement learning (RL) to learn a control policy that generalizes over varying power constraints for the UAV. Despite recent improvements in battery technology, the maximum flying range of small UAVs is still a severe constraint, which is exacerbated by variations in the UAV's power consumption that are hard to predict. By using map-like input channels to feed spatial information through convolutional network layers to the agent, we are able to train a double deep Q-network (DDQN) to make control decisions for the UAV, balancing limited power budget and coverage goal. The proposed method can be applied to a wide variety of environments and harmonizes complex goal structures with system constraints.

IT · Apr 28, 2019

Machine Learning in the Air

Deniz Gunduz, Paul de Kerret, Nicholas D. Sidiropoulos et al.

Thanks to the recent advances in processing speed and data acquisition and storage, machine learning (ML) is penetrating every facet of our lives, and transforming research in many areas in a fundamental manner. Wireless communications is another success story -- ubiquitous in our lives, from handheld devices to wearables, smart homes, and automobiles. While recent years have seen a flurry of research activity in exploiting ML tools for various wireless communication problems, the impact of these techniques in practical communication systems and standards is yet to be seen. In this paper, we review some of the major promises and challenges of ML in wireless communication systems, focusing mainly on the physical layer. We present some of the most striking recent accomplishments that ML techniques have achieved with respect to classical approaches, and point to promising research directions where ML is likely to make the biggest impact in the near future. We also highlight the complementary problem of designing physical layer techniques to enable distributed ML at the wireless network edge, which further emphasizes the need to understand and connect ML with fundamental concepts in wireless communications.