Zijian Hu

h-index13

8papers

112citations

Novelty50%

AI Score43

Ranked #55,679 of 194,257 authors (top 29%)#3,351 in AI (top 27%)

8 Papers

18.0CRJun 8, 2023

FedSecurity: Benchmarking Attacks and Defenses in Federated Learning and Federated LLMs

Shanshan Han, Baturalp Buyukates, Zijian Hu et al.

This paper introduces FedSecurity, an end-to-end benchmark that serves as a supplementary component of the FedML library for simulating adversarial attacks and corresponding defense mechanisms in Federated Learning (FL). FedSecurity eliminates the need for implementing the fundamental FL procedures, e.g., FL training and data loading, from scratch, thus enables users to focus on developing their own attack and defense strategies. It contains two key components, including FedAttacker that conducts a variety of attacks during FL training, and FedDefender that implements defensive mechanisms to counteract these attacks. FedSecurity has the following features: i) It offers extensive customization options to accommodate a broad range of machine learning models (e.g., Logistic Regression, ResNet, and GAN) and FL optimizers (e.g., FedAVG, FedOPT, and FedNOVA); ii) it enables exploring the effectiveness of attacks and defenses across different datasets and models; and iii) it supports flexible configuration and customization through a configuration file and some APIs. We further demonstrate FedSecurity's utility and adaptability through federated training of Large Language Models (LLMs) to showcase its potential on a wide range of complex applications.

16.4DCJul 23, 2024

ScaleLLM: A Resource-Frugal LLM Serving Framework by Optimizing End-to-End Efficiency

Yuhang Yao, Han Jin, Alay Dilipbhai Shah et al.

Large language models (LLMs) have surged in popularity and are extensively used in commercial applications, where the efficiency of model serving is crucial for the user experience. Most current research focuses on optimizing individual sub-procedures, e.g. local inference and communication, however, there is no comprehensive framework that provides a holistic system view for optimizing LLM serving in an end-to-end manner. In this work, we conduct a detailed analysis to identify major bottlenecks that impact end-to-end latency in LLM serving systems. Our analysis reveals that a comprehensive LLM serving endpoint must address a series of efficiency bottlenecks that extend beyond LLM inference. We then propose ScaleLLM, an optimized system for resource-efficient LLM serving. Our extensive experiments reveal that with 64 concurrent requests, ScaleLLM achieves a 4.3x speed up over vLLM and outperforms state-of-the-arts with 1.5x higher throughput.

7.7AIJul 4, 2022

Asynchronous Curriculum Experience Replay: A Deep Reinforcement Learning Approach for UAV Autonomous Motion Control in Unknown Dynamic Environments

Zijian Hu, Xiaoguang Gao, Kaifang Wan et al.

Unmanned aerial vehicles (UAVs) have been widely used in military warfare. In this paper, we formulate the autonomous motion control (AMC) problem as a Markov decision process (MDP) and propose an advanced deep reinforcement learning (DRL) method that allows UAVs to execute complex tasks in large-scale dynamic three-dimensional (3D) environments. To overcome the limitations of the prioritized experience replay (PER) algorithm and improve performance, the proposed asynchronous curriculum experience replay (ACER) uses multithreads to asynchronously update the priorities, assigns the true priorities and applies a temporary experience pool to make available experiences of higher quality for learning. A first-in-useless-out (FIUO) experience pool is also introduced to ensure the higher use value of the stored experiences. In addition, combined with curriculum learning (CL), a more reasonable training paradigm of sampling experiences from simple to difficult is designed for training UAVs. By training in a complex unknown environment constructed based on the parameters of a real UAV, the proposed ACER improves the convergence speed by 24.66\% and the convergence result by 5.59\% compared to the state-of-the-art twin delayed deep deterministic policy gradient (TD3) algorithm. The testing experiments carried out in environments with different complexities demonstrate the strong robustness and generalization ability of the ACER agent.

5.3LGMar 4, 2023

Demonstration-guided Deep Reinforcement Learning for Coordinated Ramp Metering and Perimeter Control in Large Scale Networks

Zijian Hu, Wei Ma

Effective traffic control methods have great potential in alleviating network congestion. Existing literature generally focuses on a single control approach, while few studies have explored the effectiveness of integrated and coordinated control approaches. This study considers two representative control approaches: ramp metering for freeways and perimeter control for homogeneous urban roads, and we aim to develop a deep reinforcement learning (DRL)-based coordinated control framework for large-scale networks. The main challenges are 1) there is a lack of efficient dynamic models for both freeways and urban roads; 2) the standard DRL method becomes ineffective due to the complex and non-stationary network dynamics. In view of this, we propose a novel meso-macro dynamic network model and first time develop a demonstration-guided DRL method to achieve large-scale coordinated ramp metering and perimeter control. The dynamic network model hybridizes the link and generalized bathtub models to depict the traffic dynamics of freeways and urban roads, respectively. For the DRL method, we incorporate demonstration to guide the DRL method for better convergence by introducing the concept of "teacher" and "student" models. The teacher models are traditional controllers (e.g., ALINEA, Gating), which provide control demonstrations. The student models are DRL methods, which learn from the teacher and aim to surpass the teacher's performance. To validate the proposed framework, we conduct two case studies in a small-scale network and a real-world large-scale traffic network in Hong Kong. The research outcome reveals the great potential of combining traditional controllers with DRL for coordinated control in large-scale networks.

20.9AIMay 28

DeepSurvey: Enhancing Analytical Depth and Citation Reliability in Automated Survey Generation

Ziyue Yang, Da Ma, Hanqi Li et al.

As scientific literature grows rapidly, automated survey generation has become a key capability for AI scientists and human researchers. However, existing systems suffer from limited analytical depth due to reliance on abstracts and isolated paper processing, and unreliable citations from imprecise retrieval and post-hoc grounding, producing superficial surveys and may mislead researchers. We present DeepSurvey, an agentic system that addresses both. To enhance depth, DeepSurvey extracts structured keynotes from full-text papers, models cross-paper relationships through clustering and comparative analysis, and integrates code-repository analysis to recover implementation-level details. To fortify reliability, it combines citation-graph expansion with hybrid filtering for topic-focussed retrieval, enforces evidence-constrained citation assignment, and deploys multi-granularity agentic refinement to validate citation-claim alignment. Experiments show that DeepSurvey achieves the highest content score (8.644/10) and citation quality (12.3% and 9.3% recall and precision gains over the strongest baseline), generalizes more robustly across domains (0.14 vs 0.22 to 0.69 CS-to-non-CS drop), and is preferred over human-written surveys by domain experts (83.3% overall quality, 100% content depth).

1.8LGAug 11, 2022

Heterogeneous Line Graph Transformer for Math Word Problems

Zijian Hu, Meng Jiang

This paper describes the design and implementation of a new machine learning model for online learning systems. We aim at improving the intelligent level of the systems by enabling an automated math word problem solver which can support a wide range of functions such as homework correction, difficulty estimation, and priority recommendation. We originally planned to employ existing models but realized that they processed a math word problem as a sequence or a homogeneous graph of tokens. Relationships between the multiple types of tokens such as entity, unit, rate, and number were ignored. We decided to design and implement a novel model to use such relational data to bridge the information gap between human-readable language and machine-understandable logical form. We propose a heterogeneous line graph transformer (HLGT) model that constructs a heterogeneous line graph via semantic role labeling on math word problems and then perform node representation learning aware of edge types. We add numerical comparison as an auxiliary task to improve model training for real-world use. Experimental results show that the proposed model achieves a better performance than existing models and suggest that it is still far below human performance. Information utilization and knowledge discovery is continuously needed to improve the online learning systems.

4.2AINov 7, 2024

Alopex: A Computational Framework for Enabling On-Device Function Calls with LLMs

Yide Ran, Zhaozhuo Xu, Yuhang Yao et al.

The rapid advancement of Large Language Models (LLMs) has led to their increased integration into mobile devices for personalized assistance, which enables LLMs to call external API functions to enhance their performance. However, challenges such as data scarcity, ineffective question formatting, and catastrophic forgetting hinder the development of on-device LLM agents. To tackle these issues, we propose Alopex, a framework that enables precise on-device function calls using the Fox LLM. Alopex introduces a logic-based method for generating high-quality training data and a novel ``description-question-output'' format for fine-tuning, reducing risks of function information leakage. Additionally, a data mixing strategy is used to mitigate catastrophic forgetting, combining function call data with textbook datasets to enhance performance in various tasks. Experimental results show that Alopex improves function call accuracy and significantly reduces catastrophic forgetting, providing a robust solution for integrating function call capabilities into LLMs without manual intervention.

3.7CVOct 29, 2021Code

Turning Traffic Monitoring Cameras into Intelligent Sensors for Traffic Density Estimation

Zijian Hu, William H. K. Lam, S. C. Wong et al.

Accurate traffic state information plays a pivotal role in the Intelligent Transportation Systems (ITS), and it is an essential input to various smart mobility applications such as signal coordination and traffic flow prediction. The current practice to obtain the traffic state information is through specialized sensors such as loop detectors and speed cameras. In most metropolitan areas, traffic monitoring cameras have been installed to monitor the traffic conditions on arterial roads and expressways, and the collected videos or images are mainly used for visual inspection by traffic engineers. Unfortunately, the data collected from traffic monitoring cameras are affected by the 4L characteristics: Low frame rate, Low resolution, Lack of annotated data, and Located in complex road environments. Therefore, despite the great potentials of the traffic monitoring cameras, the 4L characteristics hinder them from providing useful traffic state information (e.g., speed, flow, density). This paper focuses on the traffic density estimation problem as it is widely applicable to various traffic surveillance systems. To the best of our knowledge, there is a lack of the holistic framework for addressing the 4L characteristics and extracting the traffic density information from traffic monitoring camera data. In view of this, this paper proposes a framework for estimating traffic density using uncalibrated traffic monitoring cameras with 4L characteristics. The proposed framework consists of two major components: camera calibration and vehicle detection. The camera calibration method estimates the actual length between pixels in the images and videos, and the vehicle counts are extracted from the deep-learning-based vehicle detection method. Combining the two components, high-granular traffic density can be estimated. To validate the proposed framework, two case studies were conducted in Hong Kong and Sacramento. The results show that the Mean Absolute Error (MAE) in camera calibration is less than 0.2 meters out of 6 meters, and the accuracy of vehicle detection under various conditions is approximately 90%. Overall, the MAE for the estimated density is 9.04 veh/km/lane in Hong Kong and 1.30 veh/km/lane in Sacramento. The research outcomes can be used to calibrate the speed-density fundamental diagrams, and the proposed framework can provide accurate and real-time traffic information without installing additional sensors.