Anwar Walid

LG
h-index26
15papers
603citations
Novelty41%
AI Score40

15 Papers

LGFeb 4, 2023
Deep Reinforcement Learning for Traffic Light Control in Intelligent Transportation Systems

Ming Zhu, Xiao-Yang Liu, Sem Borst et al.

Smart traffic lights in intelligent transportation systems (ITSs) are envisioned to greatly increase traffic efficiency and reduce congestion. Deep reinforcement learning (DRL) is a promising approach to adaptively control traffic lights based on the real-time traffic situation in a road network. However, conventional methods may suffer from poor scalability. In this paper, we investigate deep reinforcement learning to control traffic lights, and both theoretical analysis and numerical experiments show that the intelligent behavior ``greenwave" (i.e., a vehicle will see a progressive cascade of green lights, and not have to brake at any intersection) emerges naturally a grid road network, which is proved to be the optimal policy in an avenue with multiple cross streets. As a first step, we use two DRL algorithms for the traffic light control problems in two scenarios. In a single road intersection, we verify that the deep Q-network (DQN) algorithm delivers a thresholding policy; and in a grid road network, we adopt the deep deterministic policy gradient (DDPG) algorithm. Secondly, numerical experiments show that the DQN algorithm delivers the optimal control, and the DDPG algorithm with passive observations has the capability to produce on its own a high-level intelligent behavior in a grid road network, namely, the ``greenwave" policy emerges. We also verify the ``greenwave" patterns in a $5 \times 10$ grid road network. Thirdly, the ``greenwave" patterns demonstrate that DRL algorithms produce favorable solutions since the ``greenwave" policy shown in experiment results is proved to be optimal in a specified traffic model (an avenue with multiple cross streets). The delivered policies both in a single road intersection and a grid road network demonstrate the scalability of DRL algorithms.

LGMar 15, 2022
Regenerative Particle Thompson Sampling

Zeyu Zhou, Bruce Hajek, Nakjung Choi et al.

This paper proposes regenerative particle Thompson sampling (RPTS), a flexible variation of Thompson sampling. Thompson sampling itself is a Bayesian heuristic for solving stochastic bandit problems, but it is hard to implement in practice due to the intractability of maintaining a continuous posterior distribution. Particle Thompson sampling (PTS) is an approximation of Thompson sampling obtained by simply replacing the continuous distribution by a discrete distribution supported at a set of weighted static particles. We observe that in PTS, the weights of all but a few fit particles converge to zero. RPTS is based on the heuristic: delete the decaying unfit particles and regenerate new particles in the vicinity of fit surviving particles. Empirical evidence shows uniform improvement from PTS to RPTS and flexibility and efficacy of RPTS across a set of representative bandit problems, including an application to 5G network slicing.

NINov 3, 2022
Fair and Efficient Distributed Edge Learning with Hybrid Multipath TCP

Shiva Raj Pokhrel, Jinho Choi, Anwar Walid

The bottleneck of distributed edge learning (DEL) over wireless has shifted from computing to communication, primarily the aggregation-averaging (Agg-Avg) process of DEL. The existing transmission control protocol (TCP)-based data networking schemes for DEL are application-agnostic and fail to deliver adjustments according to application layer requirements. As a result, they introduce massive excess time and undesired issues such as unfairness and stragglers. Other prior mitigation solutions have significant limitations as they balance data flow rates from workers across paths but often incur imbalanced backlogs when the paths exhibit variance, causing stragglers. To facilitate a more productive DEL, we develop a hybrid multipath TCP (MPTCP) by combining model-based and deep reinforcement learning (DRL) based MPTCP for DEL that strives to realize quicker iteration of DEL and better fairness (by ameliorating stragglers). Hybrid MPTCP essentially integrates two radical TCP developments: i) successful existing model-based MPTCP control strategies and ii) advanced emerging DRL-based techniques, and introduces a novel hybrid MPTCP data transport for easing the communication of the Agg-Avg process. Extensive emulation results demonstrate that the proposed hybrid MPTCP can overcome excess time consumption and ameliorate the application layer unfairness of DEL effectively without injecting additional inconstancy and stragglers.

LGMar 14
Gated Graph Attention Networks for Predicting Duration of Large Scale Power Outages Induced by Natural Disasters

Chenghao Duan, Chuanyi Ji, Anwar Walid et al.

The occurrence of large-scale power outages induced by natural disasters has been on the rise in a changing climate. Such power outages often last extended durations, causing substantial financial losses and socioeconomic impacts to customers. Accurate estimation of outage duration is thus critical for enhancing the resilience of energy infrastructure under severe weather. We formulate such a task as a machine learning (ML) problem with focus on unique real-world challenges: high-order spatial dependency in the data, a moderate number of large-scale outage events, heterogeneous types of such events, and different impacts in a region within each event. To address these challenges, we develop a Bimodal Gated Graph Attention Network (BiGGAT), a graph-based neural network model, that integrates a Graph Attention Network (GAT) with a Gated Recurrent Unit (GRU) to capture the complex spatial characteristics. We evaluate the approach in a setting of inductive learning, using large-scale power outage data from six major hurricanes in the Southeastern United States. Experimental results demonstrate that BiGGAT achieves a superior performance compared to benchmark models.

NIJan 28, 2025Code
Distilling Large Language Models for Network Active Queue Management

Shiva Raj Pokhrel, Deol Satish, Jonathan Kua et al.

The growing complexity of network traffic and demand for ultra-low latency communication require smarter packet traffic management. Existing Deep Learning-based queuing approaches struggle with dynamic network scenarios and demand high engineering effort. We propose AQM-LLM, distilling Large Language Models (LLMs) with few-shot learning, contextual understanding, and pattern recognition to improve Active Queue Management (AQM) [RFC 9330] with minimal manual effort. We consider a specific case where AQM is Low Latency, Low Loss, and Scalable Throughput (L4S) and our design of AQM-LLM builds on speculative decoding and reinforcement-based distilling of LLM by tackling congestion prevention in the L4S architecture using Explicit Congestion Notification (ECN) [RFC 9331] and periodic packet dropping. We develop a new open-source experimental platform by executing L4S-AQM on FreeBSD-14, providing interoperable modules to support LLM integration and facilitate IETF recognition through wider testing. Our extensive evaluations show L4S-LLM enhances queue management, prevents congestion, reduces latency, and boosts network performance, showcasing LLMs' adaptability and efficiency in uplifting AQM systems.

LGFeb 21, 2024
FinGPT-HPC: Efficient Pretraining and Finetuning Large Language Models for Financial Applications with High-Performance Computing

Xiao-Yang Liu, Jie Zhang, Guoxuan Wang et al.

Large language models (LLMs) are computationally intensive. The computation workload and the memory footprint grow quadratically with the dimension (layer width). Most of LLMs' parameters come from the linear layers of the transformer structure and are highly redundant. These linear layers contribute more than 80% of the computation workload and 99% of the model size. To pretrain and finetune LLMs efficiently, there are three major challenges to address: 1) reducing redundancy of the linear layers; 2) reducing GPU memory footprint; 3) improving GPU utilization when using distributed training. Prior methods, such as LoRA and QLoRA, utilized low-rank matrices and quantization to reduce the number of trainable parameters and model size, respectively. However, the resulting model still consumes a large amount of GPU memory. In this paper, we present high-performance GPU-based methods that exploit low-rank structures to pretrain and finetune LLMs for financial applications. We replace one conventional linear layer of the transformer structure with two narrower linear layers, which allows us to reduce the number of parameters by several orders of magnitude. By quantizing the parameters into low precision (8-bit and 4-bit), the memory consumption of the resulting model is further reduced. Compared with existing LLMs, our methods achieve a speedup of 1.3X and a model compression ratio of 2.64X for pretaining without accuracy drop. For finetuning, our methods achieve an average accuracy increase of 6.3% and 24.0% in general tasks and financial tasks, respectively, and GPU memory consumption ratio of 6.3X. The sizes of our models are smaller than 0.59 GB, allowing inference on a smartphone.

LGDec 11, 2021
ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning

Xiao-Yang Liu, Zechu Li, Zhuoran Yang et al.

Deep reinforcement learning (DRL) has revolutionized learning and actuation in applications such as game playing and robotic control. The cost of data collection, i.e., generating transitions from agent-environment interactions, remains a major challenge for wider DRL adoption in complex real-world problems. Following a cloud-native paradigm to train DRL agents on a GPU cloud platform is a promising solution. In this paper, we present a scalable and elastic library ElegantRL-podracer for cloud-native deep reinforcement learning, which efficiently supports millions of GPU cores to carry out massively parallel training at multiple levels. At a high-level, ElegantRL-podracer employs a tournament-based ensemble scheme to orchestrate the training process on hundreds or even thousands of GPUs, scheduling the interactions between a leaderboard and a training pool with hundreds of pods. At a low-level, each pod simulates agent-environment interactions in parallel by fully utilizing nearly 7,000 GPU CUDA cores in a single GPU. Our ElegantRL-podracer library features high scalability, elasticity and accessibility by following the development principles of containerization, microservices and MLOps. Using an NVIDIA DGX SuperPOD cloud, we conduct extensive experiments on various tasks in locomotion and stock trading and show that ElegantRL-podracer substantially outperforms RLlib. Our codes are available on GitHub.

LGFeb 25, 2021
Emerging Trends in Federated Learning: From Model Fusion to Federated X Learning

Shaoxiong Ji, Yue Tan, Teemu Saravirta et al.

Federated learning is a new learning paradigm that decouples data collection and model training via multi-party computation and model aggregation. As a flexible learning setting, federated learning has the potential to integrate with other learning frameworks. We conduct a focused survey of federated learning in conjunction with other learning algorithms. Specifically, we explore various learning algorithms to improve the vanilla federated averaging algorithm and review model fusion methods such as adaptive aggregation, regularization, clustered methods, and Bayesian methods. Following the emerging trends, we also discuss federated learning in the intersection with other learning paradigms, termed federated X learning, where X includes multitask learning, meta-learning, transfer learning, unsupervised learning, and reinforcement learning. In addition to reviewing state-of-the-art studies, this paper also identifies key challenges and applications in this field, while also highlighting promising future directions.

LGDec 13, 2020
Privacy-preserving Decentralized Aggregation for Federated Learning

Beomyeol Jeon, S. M. Ferdous, Muntasir Raihan Rahman et al.

Federated learning is a promising framework for learning over decentralized data spanning multiple regions. This approach avoids expensive central training data aggregation cost and can improve privacy because distributed sites do not have to reveal privacy-sensitive data. In this paper, we develop a privacy-preserving decentralized aggregation protocol for federated learning. We formulate the distributed aggregation protocol with the Alternating Direction Method of Multiplier (ADMM) and examine its privacy weakness. Unlike prior work that use Differential Privacy or homomorphic encryption for privacy, we develop a protocol that controls communication among participants in each round of aggregation to minimize privacy leakage. We establish its privacy guarantee against an honest-but-curious adversary. We also propose an efficient algorithm to construct such a communication pattern, inspired by combinatorial block design theory. Our secure aggregation protocol based on this novel group communication pattern design leads to an efficient algorithm for federated training with privacy guarantees. We evaluate our federated training algorithm on image classification and next-word prediction applications over benchmark datasets with 9 and 15 distributed sites. Evaluation results show that our algorithm performs comparably to the standard centralized federated learning method while preserving privacy; the degradation in test accuracy is only up to 0.73%.

LGMar 21, 2020
Dynamic Sampling and Selective Masking for Communication-Efficient Federated Learning

Shaoxiong Ji, Wenqi Jiang, Anwar Walid et al.

Federated learning (FL) is a novel machine learning setting that enables on-device intelligence via decentralized training and federated optimization. Deep neural networks' rapid development facilitates the learning techniques for modeling complex problems and emerges into federated deep learning under the federated setting. However, the tremendous amount of model parameters burdens the communication network with a high load of transportation. This paper introduces two approaches for improving communication efficiency by dynamic sampling and top-$k$ selective masking. The former controls the fraction of selected client models dynamically, while the latter selects parameters with top-$k$ largest values of difference for federated updating. Experiments on convolutional image classification and recurrent language modeling are conducted on three public datasets to show our proposed methods' effectiveness.

LGJun 12, 2019
Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks

Ming Zhu, Xiao-Yang Liu, Anwar Walid

Unmanned aerial vehicles (UAVs) are envisioned to complement the 5G communication infrastructure in future smart cities. Hot spots easily appear in road intersections, where effective communication among vehicles is challenging. UAVs may serve as relays with the advantages of low price, easy deployment, line-of-sight links, and flexible mobility. In this paper, we study a UAV-assisted vehicular network where the UAV jointly adjusts its transmission control (power and channel) and 3D flight to maximize the total throughput. First, we formulate a Markov decision process (MDP) problem by modeling the mobility of the UAV/vehicles and the state transitions. Secondly, we solve the target problem using a deep reinforcement learning method, namely, the deep deterministic policy gradient (DDPG), and propose three solutions with different control objectives. Deep reinforcement learning methods obtain the optimal policy through the interactions with the environment without knowing the environment variables. Considering that environment variables in our problem are unknown and unmeasurable, we choose a deep reinforcement learning method to solve it. Moreover, considering the energy consumption of 3D flight, we extend the proposed solutions to maximize the total throughput per unit energy. To encourage or discourage the UAV's mobility according to its prediction, the DDPG framework is modified, where the UAV adjusts its learning rate automatically. Thirdly, in a simplified model with small state space and action space, we verify the optimality of proposed algorithms. Comparing with two baseline schemes, we demonstrate the effectiveness of proposed algorithms in a realistic model.

LGDec 3, 2018
Deep Reinforcement Learning for Intelligent Transportation Systems

Xiao-Yang Liu, Zihan Ding, Sem Borst et al.

Intelligent Transportation Systems (ITSs) are envisioned to play a critical role in improving traffic flow and reducing congestion, which is a pervasive issue impacting urban areas around the globe. Rapidly advancing vehicular communication and edge cloud computation technologies provide key enablers for smart traffic management. However, operating viable real-time actuation mechanisms on a practically relevant scale involves formidable challenges, e.g., policy iteration and conventional Reinforcement Learning (RL) techniques suffer from poor scalability due to state space explosion. Motivated by these issues, we explore the potential for Deep Q-Networks (DQN) to optimize traffic light control policies. As an initial benchmark, we establish that the DQN algorithms yield the "thresholding" policy in a single-intersection. Next, we examine the scalability properties of DQN algorithms and their performance in a linear network topology with several intersections along a main artery. We demonstrate that DQN algorithms produce intelligent behavior, such as the emergence of "greenwave" patterns, reflecting their ability to learn favorable traffic light actuations.

LGNov 19, 2018
Practical Deep Reinforcement Learning Approach for Stock Trading

Xiao-Yang Liu, Zhuoran Xiong, Shan Zhong et al.

Stock trading strategy plays a crucial role in investment companies. However, it is challenging to obtain optimal strategy in the complex and dynamic stock market. We explore the potential of deep reinforcement learning to optimize stock trading strategy and thus maximize investment return. 30 stocks are selected as our trading stocks and their daily prices are used as the training and trading market environment. We train a deep reinforcement learning agent and obtain an adaptive trading strategy. The agent's performance is evaluated and compared with Dow Jones Industrial Average and the traditional min-variance portfolio allocation strategy. The proposed deep reinforcement learning approach is shown to outperform the two baselines in terms of both the Sharpe ratio and cumulative returns.

LGNov 18, 2018
Transform-Based Multilinear Dynamical System for Tensor Time Series Analysis

Weijun Lu, Xiao-Yang Liu, Qingwei Wu et al.

We propose a novel multilinear dynamical system (MLDS) in a transform domain, named $\mathcal{L}$-MLDS, to model tensor time series. With transformations applied to a tensor data, the latent multidimensional correlations among the frontal slices are built, and thus resulting in the computational independence in the transform domain. This allows the exact separability of the multi-dimensional problem into multiple smaller LDS problems. To estimate the system parameters, we utilize the expectation-maximization (EM) algorithm to determine the parameters of each LDS. Further, $\mathcal{L}$-MLDSs significantly reduce the model parameters and allows parallel processing. Our general $\mathcal{L}$-MLDS model is implemented based on different transforms: discrete Fourier transform, discrete cosine transform and discrete wavelet transform. Due to the nonlinearity of these transformations, $\mathcal{L}$-MLDS is able to capture the nonlinear correlations within the data unlike the MLDS \cite{rogers2013multilinear} which assumes multi-way linear correlations. Using four real datasets, the proposed $\mathcal{L}$-MLDS is shown to achieve much higher prediction accuracy than the state-of-the-art MLDS and LDS with an equal number of parameters under different noise models. In particular, the relative errors are reduced by $50\% \sim 99\%$. Simultaneously, $\mathcal{L}$-MLDS achieves an exponential improvement in the model's training time than MLDS.

ITDec 13, 2017
Multidimensional Data Tensor Sensing for RF Tomographic Imaging

Tao Deng, Xiao-Yang Liu, Feng Qian et al.

Radio-frequency (RF) tomographic imaging is a promising technique for inferring multi-dimensional physical space by processing RF signals traversed across a region of interest. However, conventional RF tomography schemes are generally based on vector compressed sensing, which ignores the geometric structures of the target spaces and leads to low recovery precision. The recently proposed transform-based tensor model is more appropriate for sensory data processing, as it helps exploit the geometric structures of the three-dimensional target and improve the recovery precision. In this paper, we propose a novel tensor sensing approach that achieves highly accurate estimation for real-world three-dimensional spaces. First, we use the transform-based tensor model to formulate a tensor sensing problem, and propose a fast alternating minimization algorithm called Alt-Min. Secondly, we drive an algorithm which is optimized to reduce memory and computation requirements. Finally, we present evaluation of our Alt-Min approach using IKEA 3D data and demonstrate significant improvement in recovery error and convergence speed compared to prior tensor-based compressed sensing.