LG Jun 17, 2022
FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning
Anis Elgabli, Chaouki Ben Issaid, Amrit S. Bedi et al.
Newton-type methods are popular in federated learning due to their fast convergence. Still, they suffer from two main issues: low communication efficiency and low privacy, both due to the requirement of sending Hessian information from clients to the parameter server (PS). In this work, we introduce a novel framework called FedNew in which there is no need to transmit Hessian information from clients to the PS, hence resolving the bottleneck to improve communication efficiency. In addition, FedNew hides the gradient information and results in a privacy-preserving approach compared to the existing state-of-the-art. The core novel idea in FedNew is to introduce a two-level framework, and alternate between updating the inverse Hessian-gradient product using only one alternating direction method of multipliers (ADMM) step and then performing the global model update using Newton's method. Though only one ADMM pass is used to approximate the inverse Hessian-gradient product at each iteration, we develop a novel theoretical approach to show the converging behavior of FedNew for convex problems. Additionally, a significant reduction in communication overhead is achieved by utilizing stochastic quantization. Numerical results using real datasets show the superiority of FedNew compared to existing methods in terms of communication costs.
LG Aug 29, 2022
DR-DSGD: A Distributionally Robust Decentralized Learning Algorithm over Graphs
Chaouki Ben Issaid, Anis Elgabli, Mehdi Bennis
In this paper, we propose to solve a regularized distributionally robust learning problem in the decentralized setting, taking into account the data distribution shift. By adding a Kullback-Leibler regularization function to the robust min-max optimization problem, the learning problem can be reduced to a modified robust minimization problem and solved efficiently. Leveraging the newly formulated optimization problem, we propose a robust version of Decentralized Stochastic Gradient Descent (DSGD), coined Distributionally Robust Decentralized Stochastic Gradient Descent (DR-DSGD). Under some mild assumptions and provided that the regularization parameter is larger than one, we theoretically prove that DR-DSGD achieves a convergence rate of $\mathcal{O}\left(1/\sqrt{KT} + K/T\right)$, where $K$ is the number of devices and $T$ is the number of iterations. Simulation results show that our proposed algorithm can improve the worst distribution test accuracy by up to $10\%$. Moreover, DR-DSGD is more communication-efficient than DSGD since it requires fewer communication rounds (up to $20$ times fewer) to achieve the same worst distribution test accuracy target. Furthermore, the conducted experiments reveal that DR-DSGD results in a fairer performance across devices in terms of test accuracy.
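The reduction from a min-max problem to a plain minimization can be illustrated with a small numerical sketch. It relies on a standard identity: maximizing a weighted sum of losses over the probability simplex, minus a Kullback-Leibler penalty toward the uniform distribution, has a closed-form log-sum-exp value. The uniform reference distribution, the function names, and the example losses below are assumptions made for illustration; the abstract does not spell out the exact formulation used in DR-DSGD.

```python
import math

def kl_regularized_worst_case(losses, mu):
    """Closed form of max_lam sum_i lam_i*loss_i - mu*KL(lam || uniform).

    Equals mu * log((1/K) * sum_i exp(loss_i / mu)); the max is subtracted
    first only for numerical stability.
    """
    k = len(losses)
    m = max(losses)
    return m + mu * math.log(sum(math.exp((l - m) / mu) for l in losses) / k)

def maximizing_weights(losses, mu):
    """Weights attaining the maximum: a softmax of the losses with temperature mu."""
    m = max(losses)
    w = [math.exp((l - m) / mu) for l in losses]
    total = sum(w)
    return [x / total for x in w]

def regularized_objective(losses, lam, mu):
    """Inner objective evaluated at an arbitrary point lam of the simplex."""
    k = len(losses)
    kl = sum(p * math.log(p * k) for p in lam if p > 0)
    return sum(p * l for p, l in zip(lam, losses)) - mu * kl

# Hypothetical per-device losses and a regularization parameter larger than one.
losses = [0.2, 1.0, 3.0]
mu = 2.0
lam_star = maximizing_weights(losses, mu)
closed_form = kl_regularized_worst_case(losses, mu)
```

Once the inner maximization is replaced by this closed form, only a minimization over the model parameters remains, which is what makes a gradient-descent style method applicable.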
CR Jun 21, 2025
AdRo-FL: Informed and Secure Client Selection for Federated Learning in the Presence of Adversarial Aggregator
Md. Kamrul Hossain, Walid Aljoby, Anis Elgabli et al.
Federated Learning (FL) enables collaborative learning without exposing clients' data. While clients only share model updates with the aggregator, studies reveal that aggregators can infer sensitive information from these updates. Secure Aggregation (SA) protects individual updates during transmission; however, recent work demonstrates a critical vulnerability where adversarial aggregators manipulate client selection to bypass SA protections, constituting a Biased Selection Attack (BSA). Although verifiable random selection prevents BSA, it precludes informed client selection essential for FL performance. We propose Adversarial Robust Federated Learning (AdRo-FL), which simultaneously enables: informed client selection based on client utility, and robust defense against BSA maintaining privacy-preserving aggregation. AdRo-FL implements two client selection frameworks tailored for distinct settings. The first framework assumes clients are grouped into clusters based on mutual trust, such as different branches of an organization. The second framework handles distributed clients where no trust relationships exist between them. For the cluster-oriented setting, we propose a novel defense against BSA by (1) enforcing a minimum client selection quota from each cluster, supervised by a cluster-head in every round, and (2) introducing a client utility function to prioritize efficient clients. For the distributed setting, we design a two-phase selection protocol: first, the aggregator selects the top clients based on our utility-driven ranking; then, a verifiable random function (VRF) ensures a BSA-resistant final selection. AdRo-FL also applies quantization to reduce communication overhead and sets strict transmission deadlines to improve energy efficiency. AdRo-FL achieves up to $1.85\times$ faster time-to-accuracy and up to $1.06\times$ higher final accuracy compared to insecure baselines.
LG Dec 22, 2023
Balancing Energy Efficiency and Distributional Robustness in Over-the-Air Federated Learning
Mohamed Badi, Chaouki Ben Issaid, Anis Elgabli et al.
The growing number of wireless edge devices has magnified challenges concerning energy, bandwidth, latency, and data heterogeneity. These challenges have become bottlenecks for distributed learning. To address these issues, this paper presents a novel approach that ensures energy efficiency for distributionally robust federated learning (FL) with over-the-air computation (AirComp). In this context, to effectively balance robustness with energy efficiency, we introduce a novel client selection method that integrates two complementary insights: a deterministic one that is designed for energy efficiency, and a probabilistic one designed for distributional robustness. Simulation results underscore the efficacy of the proposed algorithm, revealing its superior performance compared to baselines from both robustness and energy efficiency perspectives, achieving more than 3-fold energy savings compared to the considered baselines.
LG Jun 2, 2021
Communication-Efficient Split Learning Based on Analog Communication and Over the Air Aggregation
Mounssif Krouka, Anis Elgabli, Chaouki Ben Issaid et al.
Split-learning (SL) has recently gained popularity due to its inherent privacy-preserving capabilities and ability to enable collaborative inference for devices with limited computational power. Standard SL algorithms assume an ideal underlying digital communication system and ignore the problem of scarce communication bandwidth. However, for a large number of agents, limited bandwidth resources, and time-varying communication channels, the communication bandwidth can become the bottleneck. To address this challenge, in this work, we propose a novel SL framework to solve the remote inference problem that introduces an additional layer at the agent side and constrains the choices of the weights and the biases to ensure over the air aggregation. Hence, the proposed approach maintains constant communication cost with respect to the number of agents enabling remote inference under limited bandwidth. Numerical results show that our proposed algorithm significantly outperforms the digital implementation in terms of communication-efficiency, especially as the number of agents grows large.
LG Jun 2, 2021
Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels
Mounssif Krouka, Anis Elgabli, Chaouki Ben Issaid et al.
Today's intelligent applications can achieve high performance accuracy using machine learning (ML) techniques, such as deep neural networks (DNNs). Traditionally, in a remote DNN inference problem, an edge device transmits raw data to a remote node that performs the inference task. However, this may incur high transmission energy costs and put data privacy at risk. In this paper, we propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes. The time-varying representation accounts for time-varying channels and can significantly reduce the total energy at the edge device while maintaining high accuracy (low loss). We implement our approach in an image classification task using the MNIST dataset, and the system environment is simulated as a trajectory navigation scenario to emulate different channel conditions. Numerical simulations show that our proposed solution results in minimal energy consumption and $\mathrm{CO}_2$ emission compared to the considered baselines while exhibiting robust performance across different channel conditions and bandwidth regime choices.
LG May 31, 2021
Energy-Efficient and Federated Meta-Learning via Projected Stochastic Gradient Ascent
Anis Elgabli, Chaouki Ben Issaid, Amrit S. Bedi et al.
In this paper, we propose an energy-efficient federated meta-learning framework. The objective is to enable learning a meta-model that can be fine-tuned to a new task with a small number of samples in a distributed setting and at low computation and communication energy consumption. We assume that each task is owned by a separate agent, so a limited number of tasks is used to train a meta-model. Assuming each task was trained offline on the agent's local data, we propose a lightweight algorithm that starts from the local models of all agents, and in a backward manner using projected stochastic gradient ascent (P-SGA) finds a meta-model. The proposed method avoids complex computations such as computing the Hessian, double looping, and matrix inversion, while achieving high performance at significantly less energy consumption compared to state-of-the-art methods such as MAML and iMAML in the conducted experiments on sinusoid regression and image classification tasks.
LG Nov 12, 2020
Cross Layer Optimization and Distributed Reinforcement Learning for Wireless 360° Video Streaming
Anis Elgabli, Mohammed S. Elbamby, Cristina Perfecto et al.
Wirelessly streaming high quality 360 degree videos is still a challenging problem. When there are many users watching different 360 degree videos and competing for the computing and communication resources, the streaming algorithm at hand should maximize the average quality of experience (QoE) while guaranteeing a minimum rate for each user. In this paper, we propose a cross layer optimization approach that maximizes the available rate to each user and efficiently uses it to maximize users' QoE. Particularly, we consider a tile based 360 degree video streaming, and we optimize a QoE metric that balances the tradeoff between maximizing each user's QoE and ensuring fairness among users. We show that the problem can be decoupled into two interrelated subproblems: (i) a physical layer subproblem whose objective is to find the download rate for each user, and (ii) an application layer subproblem whose objective is to use that rate to find a quality decision per tile such that the user's QoE is maximized. We prove that the physical layer subproblem can be solved optimally with low complexity and an actor-critic deep reinforcement learning (DRL) is proposed to leverage the parallel training of multiple independent agents and solve the application layer subproblem. Extensive experiments reveal the robustness of our scheme and demonstrate its significant performance improvement compared to several baseline algorithms.
LG Nov 9, 2020
BayGo: Joint Bayesian Learning and Information-Aware Graph Optimization
Tamara Alshammari, Sumudu Samarakoon, Anis Elgabli et al.
This article deals with the problem of distributed machine learning, in which agents update their models based on their local datasets, and aggregate the updated models collaboratively and in a fully decentralized manner. We tackle the problem of information heterogeneity arising in multi-agent networks where the placement of informative agents plays a crucial role in the learning dynamics. Specifically, we propose BayGo, a novel fully decentralized joint Bayesian learning and graph optimization framework with proven fast convergence over a sparse graph. Under our framework, each agent is able to learn and to communicate with the agent that is most informative for its own learning. Unlike prior works, our framework assumes no prior knowledge of the data distribution across agents, nor does it assume any knowledge of the true parameter of the system. The proposed alternating minimization based framework ensures global connectivity in a fully decentralized way while minimizing the number of communication links. We theoretically show that by optimizing the proposed objective function, the estimation error of the posterior probability distribution decreases exponentially at each iteration. Via extensive simulations, we show that our framework achieves faster convergence and higher accuracy compared to fully-connected and star topology graphs.
LG Sep 14, 2020
Communication Efficient Distributed Learning with Censored, Quantized, and Generalized Group ADMM
Chaouki Ben Issaid, Anis Elgabli, Jihong Park et al.
In this paper, we propose a communication-efficient decentralized machine learning framework that solves a consensus optimization problem defined over a network of inter-connected workers. The proposed algorithm, Censored and Quantized Generalized GADMM (CQ-GGADMM), leverages the worker grouping and decentralized learning ideas of Group Alternating Direction Method of Multipliers (GADMM), and pushes the frontier in communication efficiency by extending its applicability to generalized network topologies, while incorporating link censoring for negligible updates after quantization. We theoretically prove that CQ-GGADMM achieves a linear convergence rate when the local objective functions are strongly convex under some mild assumptions. Numerical simulations corroborate that CQ-GGADMM exhibits higher communication efficiency in terms of the number of communication rounds and transmit energy consumption without compromising the accuracy and convergence speed, compared to the censored decentralized ADMM and the worker grouping method of GADMM.
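The idea of censoring negligible updates can be sketched as follows: a worker transmits its newly quantized model only when it differs enough from the model it last transmitted. The geometrically decaying threshold `tau0 * xi**k`, the Euclidean norm, and all names below are assumptions chosen for illustration; the abstract only states that negligible updates are censored after quantization.

```python
import math

def censor_threshold(tau0, xi, k):
    """Hypothetical threshold schedule: decays geometrically with the round index k."""
    return tau0 * xi ** k

def should_transmit(new_q, last_sent, tau0, xi, k):
    """Transmit only if the quantized model moved at least the threshold."""
    change = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_q, last_sent)))
    return change >= censor_threshold(tau0, xi, k)

def rounds_transmitted(quantized_models, tau0=1.0, xi=0.5):
    """Return the round indices in which a worker actually sends its model.

    Round 0 is always sent so that neighbors have an initial copy.
    """
    last_sent = quantized_models[0]
    sent = [0]
    for k in range(1, len(quantized_models)):
        if should_transmit(quantized_models[k], last_sent, tau0, xi, k):
            last_sent = quantized_models[k]
            sent.append(k)
    return sent

# Hypothetical sequence of (already quantized) one-dimensional models.
models = [[0.0], [0.1], [0.2], [1.0], [1.01]]
sent_rounds = rounds_transmitted(models)
```

In this toy trace the small changes in rounds 1, 2, and 4 fall below the threshold and are skipped, so neighbors keep using the last model they received.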
LG Aug 6, 2020
Communication-Efficient and Distributed Learning Over Wireless Networks: Principles and Applications
Jihong Park, Sumudu Samarakoon, Anis Elgabli et al.
Machine learning (ML) is a promising enabler for the fifth generation (5G) communication systems and beyond. By imbuing intelligence into the network edge, edge nodes can proactively carry out decision-making, and thereby react to local environmental changes and disturbances while experiencing zero communication latency. To achieve this goal, it is essential to cater for high ML inference accuracy at scale under time-varying channel and network dynamics, by continuously exchanging fresh data and ML model updates in a distributed way. Taming this new kind of data traffic boils down to improving the communication efficiency of distributed learning by optimizing communication payload types, transmission techniques, and scheduling, as well as ML architectures, algorithms, and data processing methods. To this end, this article aims to provide a holistic overview of relevant communication and ML principles, and thereby present communication-efficient and distributed learning frameworks with selected use cases.
LG Jul 3, 2020
Harnessing Wireless Channels for Scalable and Privacy-Preserving Federated Learning
Anis Elgabli, Jihong Park, Chaouki Ben Issaid et al.
Wireless connectivity is instrumental in enabling scalable federated learning (FL), yet wireless channels bring challenges for model training, in which channel randomness perturbs each worker's model update while multiple workers' updates incur significant interference under limited bandwidth. To address these challenges, in this work we formulate a novel constrained optimization problem, and propose an FL framework harnessing wireless channel perturbations and interference for improving privacy, bandwidth-efficiency, and scalability. The resultant algorithm is coined analog federated ADMM (A-FADMM) based on analog transmissions and the alternating direction method of multipliers (ADMM). In A-FADMM, all workers upload their model updates to the parameter server (PS) using a single channel via analog transmissions, during which all models are perturbed and aggregated over-the-air. This not only saves communication bandwidth, but also hides each worker's exact model update trajectory from any eavesdropper including the honest-but-curious PS, thereby preserving data privacy against model inversion attacks. We formally prove the convergence and privacy guarantees of A-FADMM for convex functions under time-varying channels, and numerically show the effectiveness of A-FADMM under noisy channels and stochastic non-convex functions, in terms of convergence speed and scalability, as well as communication bandwidth and energy efficiency.
NI Jan 22, 2020
Reinforcement Learning Based Vehicle-cell Association Algorithm for Highly Mobile Millimeter Wave Communication
Hamza Khan, Anis Elgabli, Sumudu Samarakoon et al.
Vehicle-to-everything (V2X) communication is a growing area of communication with a variety of use cases. This paper investigates the problem of vehicle-cell association in millimeter wave (mmWave) communication networks. The aim is to maximize the time average rate per vehicular user (VUE) while ensuring a target minimum rate for all VUEs with low signaling overhead. We first formulate the user (vehicle) association problem as a discrete non-convex optimization problem. Then, by leveraging tools from machine learning, specifically distributed deep reinforcement learning (DDRL) and the asynchronous actor critic algorithm (A3C), we propose a low complexity algorithm that approximates the solution of the proposed optimization problem. The proposed DDRL-based algorithm endows every road side unit (RSU) with a local RL agent that selects a local action based on the observed input state. Actions of different RSUs are forwarded to a central entity that computes a global reward, which is then fed back to the RSUs. It is shown that each independently trained RL agent performs the vehicle-RSU association action with low control overhead and less computational complexity compared to running an online complex algorithm to solve the non-convex optimization problem. Finally, simulation results show that the proposed solution achieves up to 15% gains in terms of sum rate and 20% reduction in VUE outages compared to several baseline designs.
LG Nov 9, 2019
L-FGADMM: Layer-Wise Federated Group ADMM for Communication Efficient Decentralized Deep Learning
Anis Elgabli, Jihong Park, Sabbir Ahmed et al.
This article proposes a communication-efficient decentralized deep learning algorithm, coined layer-wise federated group ADMM (L-FGADMM). To minimize an empirical risk, every worker in L-FGADMM periodically communicates with two neighbors, in which the periods are separately adjusted for different layers of its deep neural network. A constrained optimization problem for this setting is formulated and solved using the stochastic version of GADMM proposed in our prior work. Numerical evaluations show that by less frequently exchanging the largest layer, L-FGADMM can significantly reduce the communication cost, without compromising the convergence speed. Surprisingly, despite less exchanged information and decentralized operations, intermittently skipping the largest layer consensus in L-FGADMM creates a regularizing effect, thereby achieving test accuracy as high as that of federated learning (FL), a baseline method that enforces consensus on all layers with the aid of a central entity.
LG Oct 23, 2019
Q-GADMM: Quantized Group ADMM for Communication Efficient Decentralized Machine Learning
Anis Elgabli, Jihong Park, Amrit S. Bedi et al.
In this article, we propose a communication-efficient decentralized machine learning (ML) algorithm, coined quantized group ADMM (Q-GADMM). To reduce the number of communication links, every worker in Q-GADMM communicates only with two neighbors, while updating its model via the group alternating direction method of multipliers (GADMM). Moreover, each worker transmits the quantized difference between its current model and its previously quantized model, thereby decreasing the communication payload size. However, due to the lack of centralized entity in decentralized ML, the spatial sparsity and payload compression may incur error propagation, hindering model training convergence. To overcome this, we develop a novel stochastic quantization method to adaptively adjust model quantization levels and their probabilities, while proving the convergence of Q-GADMM for convex objective functions. Furthermore, to demonstrate the feasibility of Q-GADMM for non-convex and stochastic problems, we propose quantized stochastic GADMM (Q-SGADMM) that incorporates deep neural network architectures and stochastic sampling. Simulation results corroborate that Q-GADMM significantly outperforms GADMM in terms of communication efficiency while achieving the same accuracy and convergence speed for a linear regression task. Similarly, for an image classification task using DNN, Q-SGADMM achieves significantly less total communication cost with identical accuracy and convergence speed compared to its counterpart without quantization, i.e., stochastic GADMM (SGADMM).
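A minimal sketch of the payload reduction described above: the worker quantizes the difference between its current model and its previously quantized model, using randomized rounding so that the quantized value is unbiased. The uniform grid, the fixed bit width, the choice of the largest absolute difference as the quantization range, and all names are assumptions for illustration; the abstract states that the actual method adapts the quantization levels and their probabilities.

```python
import math
import random

def stochastic_quantize(theta, prev_q, bits, rng):
    """Quantize theta relative to prev_q on a uniform grid with 2**bits points.

    Each coordinate is rounded up with probability equal to its fractional
    position between the two nearest grid points, so the expected value of
    the output equals theta. Returns (new_quantized_model, grid_step).
    """
    diff = [t - p for t, p in zip(theta, prev_q)]
    radius = max(abs(d) for d in diff)
    if radius == 0.0:
        return list(prev_q), 0.0
    levels = 2 ** bits - 1                  # number of grid intervals
    step = 2.0 * radius / levels
    new_q = []
    for d, p in zip(diff, prev_q):
        pos = (d + radius) / step           # position on the grid, in [0, levels]
        low = min(math.floor(pos), levels)
        frac = pos - low
        idx = min(low + (1 if rng.random() < frac else 0), levels)
        new_q.append(p - radius + idx * step)
    return new_q, step

# Hypothetical model vectors; only `bits` per coordinate plus the range
# would need to be transmitted in place of full-precision values.
rng = random.Random(0)
theta = [0.3, -1.2, 0.75, 2.0]
prev_q = [0.0, 0.0, 0.5, 1.0]
q, step = stochastic_quantize(theta, prev_q, 3, rng)
```

The receiver, which already holds `prev_q`, can rebuild `q` from the grid indices and the range, which is why only the quantized difference has to travel over the link.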
LG Aug 30, 2019
GADMM: Fast and Communication Efficient Framework for Distributed Machine Learning
Anis Elgabli, Jihong Park, Amrit S. Bedi et al.
When the data is distributed across multiple servers, lowering the communication cost between the servers (or workers) while solving the distributed learning problem is an important problem and is the focus of this paper. In particular, we propose a fast and communication-efficient decentralized framework to solve the distributed machine learning (DML) problem. The proposed algorithm, Group Alternating Direction Method of Multipliers (GADMM), is based on the Alternating Direction Method of Multipliers (ADMM) framework. The key novelty in GADMM is that it solves the problem in a decentralized topology where at most half of the workers are competing for the limited communication resources at any given time. Moreover, each worker exchanges the locally trained model only with two neighboring workers, thereby training a global model with a lower amount of communication overhead in each exchange. We prove that GADMM converges to the optimal solution for convex loss functions, and numerically show that it converges faster and is more communication-efficient than state-of-the-art communication-efficient algorithms such as the Lazily Aggregated Gradient (LAG) and dual averaging, in linear and logistic regression tasks on synthetic and real datasets. Furthermore, we propose Dynamic GADMM (D-GADMM), a variant of GADMM, and prove its convergence under the time-varying network topology of the workers.
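The alternating two-group schedule can be sketched on a toy problem. The code below runs ADMM on a chain of workers with consensus constraints between neighbors, updating the even-indexed workers ("head" group) first and the odd-indexed workers ("tail" group) second, so that only half of the workers transmit at a time and each talks to at most two neighbors. The scalar quadratic losses, the closed-form local updates, the penalty `rho`, and the function name are assumptions for illustration, not the paper's exact formulation.

```python
def gadmm_chain(targets, rho=1.0, iters=1000):
    """Head/tail ADMM on a chain for local losses 0.5 * (theta - targets[i])**2.

    Constraints theta[i] == theta[i + 1] are enforced with duals lam[i].
    The consensus solution of this toy problem is the mean of the targets.
    """
    n = len(targets)
    theta = [0.0] * n
    lam = [0.0] * (n - 1)  # lam[i] couples worker i and worker i + 1

    def local_update(i):
        # Closed-form minimizer of the augmented Lagrangian in theta[i],
        # using only the models and duals shared with the two neighbors.
        num = targets[i]
        deg = 0
        if i > 0:
            num += lam[i - 1] + rho * theta[i - 1]
            deg += 1
        if i < n - 1:
            num += -lam[i] + rho * theta[i + 1]
            deg += 1
        theta[i] = num / (1.0 + rho * deg)

    for _ in range(iters):
        for i in range(0, n, 2):   # head group updates, then transmits
            local_update(i)
        for i in range(1, n, 2):   # tail group updates, then transmits
            local_update(i)
        for i in range(n - 1):     # dual ascent on each link
            lam[i] += rho * (theta[i] - theta[i + 1])
    return theta

# Hypothetical local optima held by four workers.
final_models = gadmm_chain([1.0, 2.0, 3.0, 4.0])
```

Because head workers are only adjacent to tail workers on a chain, all workers within a group can update in parallel, which is what the sequential loops over each group emulate here.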
IT Aug 16, 2019
Distilling On-Device Intelligence at the Network Edge
Jihong Park, Shiqiang Wang, Anis Elgabli et al.
Devices at the edge of wireless networks are the last mile data sources for machine learning (ML). As opposed to traditional ready-made public datasets, these user-generated private datasets reflect the freshest local environments in real time. They are thus indispensable for enabling mission-critical intelligent systems, ranging from fog radio access networks (RANs) to driverless cars and e-Health wearables. This article focuses on how to distill high-quality on-device ML models using fog computing, from such user-generated private data dispersed across wirelessly connected devices. To this end, we introduce communication-efficient and privacy-preserving distributed ML frameworks, termed fog ML (FML), wherein on-device ML models are trained by exchanging model parameters, model outputs, and surrogate data. We then present advanced FML frameworks addressing wireless RAN characteristics, limited on-device resources, and imbalanced data distributions. Our study suggests that the full potential of FML can be reached by co-designing communication and distributed ML operations while accounting for heterogeneous hardware specifications, data characteristics, and user requirements.
NI Sep 28, 2018
GroupCast: Preference-Aware Cooperative Video Streaming with Scalable Video Coding
Anis Elgabli, Muhamad Felemban, Vaneet Aggarwal
In this paper, we propose a preference-aware cooperative video streaming system for videos encoded using Scalable Video Coding (SVC) where all the collaborating users are interested in watching a video together on a shared screen. However, each user's willingness to cooperate is subject to her own constraints such as user data plans and/or energy consumption. Using SVC, each layer of every chunk can be fetched through any of the cooperating users. We formulate the problem of finding the optimal quality decisions and fetching policy of the SVC layers of video chunks subject to the available bandwidth, chunk deadlines, and cooperation willingness of the different users as an optimization problem. The objective is to optimize a QoE metric that maintains a trade-off between maximizing the playback rate of every chunk and ensuring fairness among all chunks for the minimum skip/stall duration without violating any of the imposed constraints. We propose an offline algorithm to solve the non-convex optimization problem when the bandwidth prediction is non-causally known. This algorithm has a run-time complexity that is polynomial in the video length and the number of cooperating users. Furthermore, we propose an online version of the algorithm for more practical scenarios where erroneous bandwidth prediction for a short window is used. A real implementation on Android devices using SVC-encoded video and a public dataset of bandwidth traces reveals the robustness and performance of the proposed algorithm and shows that the algorithm significantly outperforms round-robin based mechanisms in terms of avoiding skips/stalls and fetching video chunks at their highest quality possible.
NI Jun 7, 2018
FastScan: Robust Low-Complexity Rate Adaptation Algorithm for Video Streaming over HTTP
Anis Elgabli, Vaneet Aggarwal
This paper proposes and evaluates a novel algorithm for streaming video over HTTP. The problem is formulated as a non-convex optimization problem which is constrained by the predicted available bandwidth, chunk deadlines, available video rates, and buffer occupancy. The objective is to optimize a QoE metric that maintains a tradeoff between maximizing the playback rate of every chunk and ensuring fairness among different chunks for the minimum re-buffering time. We propose FastScan, a low complexity algorithm that solves the problem. Online adaptations for dynamic bandwidth environments are proposed with imperfect available bandwidth prediction. Results of experiments driven by Variable Bit Rate (VBR) encoded video, video platform system (dash.js), and cellular bandwidth traces of a public dataset reveal the robustness of the online version of the FastScan algorithm and demonstrate its significant performance improvement as compared to the considered state-of-the-art video streaming algorithms. For example, on an experiment conducted over 100 real cellular available bandwidth traces of a public dataset that spans different available bandwidth regimes, our proposed algorithm (FastScan) achieves the minimum re-buffering (stall) time and the maximum average playback rate in every single trace as compared to the Bola, Festive, BBA, RB, FastMPC, and Pensieve algorithms.
NI Apr 30, 2018
LBP: Robust Rate Adaptation Algorithm for SVC Video Streaming
Anis Elgabli, Vaneet Aggarwal, Shuai Hao et al.
Video streaming today accounts for up to 55% of mobile traffic. In this paper, we explore streaming videos encoded using the Scalable Video Coding (SVC) scheme over highly variable bandwidth conditions such as cellular networks. SVC's unique encoding scheme allows the quality of a video chunk to change incrementally, making it more flexible and adaptive to challenging network conditions compared to other encoding schemes. Our contribution is threefold. First, we formulate the quality decisions of video chunks constrained by the available bandwidth, the playback buffer, and the chunk deadlines as an optimization problem. The objective is to optimize a novel QoE metric that models a combination of the three objectives of minimizing the stall/skip duration of the video, maximizing the playback quality of every chunk, and minimizing the number of quality switches. Second, we develop the Layered Bin Packing (LBP) Adaptation Algorithm, a novel algorithm that solves the proposed optimization problem. Moreover, we show that LBP achieves the optimal solution of the proposed optimization problem with linear complexity in the number of video chunks. Third, we propose an online algorithm (online LBP) where several challenges are addressed, including handling bandwidth prediction errors and short prediction duration. Extensive simulations with real bandwidth traces of public datasets reveal the robustness of our scheme and demonstrate its significant performance improvement as compared to the state-of-the-art SVC streaming algorithms. The proposed algorithm is also implemented on a TCP/IP emulation test bed with real LTE bandwidth traces, and the emulation confirms the simulation results and validates that the algorithm can be implemented and deployed on today's mobile devices.
NI Jan 6, 2018
Optimized Preference-Aware Multi-path Video Streaming with Scalable Video Coding
Anis Elgabli, Ke Liu, Vaneet Aggarwal
Most client hosts are equipped with multiple network interfaces (e.g., WiFi and cellular networks). Simultaneous access of multiple interfaces can significantly improve the users' quality of experience (QoE) in video streaming. An intuitive approach to achieve it is to use Multi-path TCP (MPTCP). However, the deployment of MPTCP, especially with link preference, requires an OS kernel update at both the client and server side, and a vast number of commercial content providers do not support MPTCP. Thus, in this paper, we realize a multi-path video streaming algorithm in the application layer instead, by considering Scalable Video Coding (SVC), where each layer of every chunk can be fetched from only one of the orthogonal paths. We formulate the quality decisions of video chunks subject to the available bandwidth of the different paths, chunk deadlines, and link preferences as an optimization problem. The objective is to optimize a QoE metric that maintains a tradeoff between maximizing the playback rate of every chunk and ensuring fairness among chunks. The QoE is a weighted sum of the following metrics: skip/stall duration, average playback rate, and quality switching rate. However, the weights are chosen such that pushing more chunks to the same quality level is preferable over any other choice. Even though the formulation is a non-convex discrete optimization, we show that the problem can be solved optimally with a polynomial complexity in some special cases. We further propose an online algorithm where several challenges, including bandwidth prediction errors, are addressed. Extensive emulated experiments in a real testbed with real traces of a public dataset reveal the robustness of our scheme and demonstrate its significant performance improvement compared to other multi-path algorithms.