Xiaoping Zhang

h-index85

13papers

443citations

Novelty50%

AI Score38

Ranked #87,633 of 194,257 authors (top 45%)#5,414 in AI (top 43%)

13 Papers

8.6DCApr 5, 2022

SAFARI: Sparsity enabled Federated Learning with Limited and Unreliable Communications

Yuzhu Mao, Zihao Zhao, Meilin Yang et al.

Federated learning (FL) enables edge devices to collaboratively learn a model in a distributed fashion. Many existing researches have focused on improving communication efficiency of high-dimensional models and addressing bias caused by local updates. However, most of FL algorithms are either based on reliable communications or assume fixed and known unreliability characteristics. In practice, networks could suffer from dynamic channel conditions and non-deterministic disruptions, with time-varying and unknown characteristics. To this end, in this paper we propose a sparsity enabled FL framework with both communication efficiency and bias reduction, termed as SAFARI. It makes novel use of a similarity among client models to rectify and compensate for bias that is resulted from unreliable communications. More precisely, sparse learning is implemented on local clients to mitigate communication overhead, while to cope with unreliable communications, a similarity-based compensation method is proposed to provide surrogates for missing model updates. We analyze SAFARI under bounded dissimilarity and with respect to sparse models. It is demonstrated that SAFARI under unreliable communications is guaranteed to converge at the same rate as the standard FedAvg with perfect communications. Implementations and evaluations on CIFAR-10 dataset validate the effectiveness of SAFARI by showing that it can achieve the same convergence speed and accuracy as FedAvg with perfect communications, with up to 80% of the model weights being pruned and a high percentage of client updates missing in each round.

6.6TRMar 15, 2023

Optimizing Trading Strategies in Quantitative Markets using Multi-Agent Reinforcement Learning

Hengxi Zhang, Zhendong Shi, Yuanquan Hu et al.

Quantitative markets are characterized by swift dynamics and abundant uncertainties, making the pursuit of profit-driven stock trading actions inherently challenging. Within this context, reinforcement learning (RL), which operates on a reward-centric mechanism for optimal control, has surfaced as a potentially effective solution to the intricate financial decision-making conundrums presented. This paper delves into the fusion of two established financial trading strategies, namely the constant proportion portfolio insurance (CPPI) and the time-invariant portfolio protection (TIPP), with the multi-agent deep deterministic policy gradient (MADDPG) framework. As a result, we introduce two novel multi-agent RL (MARL) methods, CPPI-MADDPG and TIPP-MADDPG, tailored for probing strategic trading within quantitative markets. To validate these innovations, we implemented them on a diverse selection of 100 real-market shares. Our empirical findings reveal that the CPPI-MADDPG and TIPP-MADDPG strategies consistently outpace their traditional counterparts, affirming their efficacy in the realm of quantitative trading.

1.2MAAug 8, 2023

Cooperative Multi-Type Multi-Agent Deep Reinforcement Learning for Resource Management in Space-Air-Ground Integrated Networks

Hengxi Zhang, Huaze Tang, Wenbo Ding et al.

The Space-Air-Ground Integrated Network (SAGIN), integrating heterogeneous devices including low earth orbit (LEO) satellites, unmanned aerial vehicles (UAVs), and ground users (GUs), holds significant promise for advancing smart city applications. However, resource management of the SAGIN is a challenge requiring urgent study in that inappropriate resource management will cause poor data transmission, and hence affect the services in smart cities. In this paper, we develop a comprehensive SAGIN system that encompasses five distinct communication links and propose an efficient cooperative multi-type multi-agent deep reinforcement learning (CMT-MARL) method to address the resource management issue. The experimental results highlight the efficacy of the proposed CMT-MARL, as evidenced by key performance indicators such as the overall transmission rate and transmission success rate. These results underscore the potential value and feasibility of future implementation of the SAGIN.

10.3SOC-PHFeb 8, 2024Code

Social Physics Informed Diffusion Model for Crowd Simulation

Hongyi Chen, Jingtao Ding, Yong Li et al.

Crowd simulation holds crucial applications in various domains, such as urban planning, architectural design, and traffic arrangement. In recent years, physics-informed machine learning methods have achieved state-of-the-art performance in crowd simulation but fail to model the heterogeneity and multi-modality of human movement comprehensively. In this paper, we propose a social physics-informed diffusion model named SPDiff to mitigate the above gap. SPDiff takes both the interactive and historical information of crowds in the current timeframe to reverse the diffusion process, thereby generating the distribution of pedestrian movement in the subsequent timeframe. Inspired by the well-known social physics model, i.e., Social Force, regarding crowd dynamics, we design a crowd interaction module to guide the denoising process and further enhance this module with the equivariant properties of crowd interactions. To mitigate error accumulation in long-term simulations, we propose a multi-frame rollout training algorithm for diffusion modeling. Experiments conducted on two real-world datasets demonstrate the superior performance of SPDiff in terms of macroscopic and microscopic evaluation metrics. Code and appendix are available at https://github.com/tsinghua-fib-lab/SPDiff.

7.9LGFeb 2, 2024Code

Few-Shot Class-Incremental Learning with Prior Knowledge

Wenhao Jiang, Duo Li, Menghan Hu et al.

To tackle the issues of catastrophic forgetting and overfitting in few-shot class-incremental learning (FSCIL), previous work has primarily concentrated on preserving the memory of old knowledge during the incremental phase. The role of pre-trained model in shaping the effectiveness of incremental learning is frequently underestimated in these studies. Therefore, to enhance the generalization ability of the pre-trained model, we propose Learning with Prior Knowledge (LwPK) by introducing nearly free prior knowledge from a few unlabeled data of subsequent incremental classes. We cluster unlabeled incremental class samples to produce pseudo-labels, then jointly train these with labeled base class samples, effectively allocating embedding space for both old and new class data. Experimental results indicate that LwPK effectively enhances the model resilience against catastrophic forgetting, with theoretical analysis based on empirical risk minimization and class distance measurement corroborating its operational principles. The source code of LwPK is publicly available at: \url{https://github.com/StevenJ308/LwPK}.

31.0CLApr 22, 2024

A Survey on Efficient Inference for Large Language Models

Zixuan Zhou, Xuefei Ning, Ke Hong et al. · tsinghua

Large Language Models (LLMs) have attracted extensive attention due to their remarkable performance across various tasks. However, the substantial computational and memory requirements of LLM inference pose challenges for deployment in resource-constrained scenarios. Efforts within the field have been directed towards developing techniques aimed at enhancing the efficiency of LLM inference. This paper presents a comprehensive survey of the existing literature on efficient LLM inference. We start by analyzing the primary causes of the inefficient LLM inference, i.e., the large model size, the quadratic-complexity attention operation, and the auto-regressive decoding approach. Then, we introduce a comprehensive taxonomy that organizes the current literature into data-level, model-level, and system-level optimization. Moreover, the paper includes comparative experiments on representative methods within critical sub-fields to provide quantitative insights. Last but not least, we provide some knowledge summary and discuss future research directions.

2.0CVJan 9, 2024Code

Uncertainty-aware Sampling for Long-tailed Semi-supervised Learning

Kuo Yang, Duo Li, Menghan Hu et al.

For semi-supervised learning with imbalance classes, the long-tailed distribution of data will increase the model prediction bias toward dominant classes, undermining performance on less frequent classes. Existing methods also face challenges in ensuring the selection of sufficiently reliable pseudo-labels for model training and there is a lack of mechanisms to adjust the selection of more reliable pseudo-labels based on different training stages. To mitigate this issue, we introduce uncertainty into the modeling process for pseudo-label sampling, taking into account that the model performance on the tailed classes varies over different training stages. For example, at the early stage of model training, the limited predictive accuracy of model results in a higher rate of uncertain pseudo-labels. To counter this, we propose an Uncertainty-Aware Dynamic Threshold Selection (UDTS) approach. This approach allows the model to perceive the uncertainty of pseudo-labels at different training stages, thereby adaptively adjusting the selection thresholds for different classes. Compared to other methods such as the baseline method FixMatch, UDTS achieves an increase in accuracy of at least approximately 5.26%, 1.75%, 9.96%, and 1.28% on the natural scene image datasets CIFAR10-LT, CIFAR100-LT, STL-10-LT, and the medical image dataset TissueMNIST, respectively. The source code of UDTS is publicly available at: https://github.com/yangk/UDTS.

28.2CLApr 1, 2025Code

Z1: Efficient Test-time Scaling with Code

Zhaojian Yu, Yinghao Wu, Yilun Zhao et al.

Large Language Models (LLMs) can achieve enhanced complex problem-solving through test-time computing scaling, yet this often entails longer contexts and numerous reasoning token costs. In this paper, we propose an efficient test-time scaling method that trains LLMs on code-related reasoning trajectories, facilitating their reduction of excess thinking tokens while maintaining performance. First, we create Z1-Code-Reasoning-107K, a curated dataset of simple and complex coding problems paired with their short and long solution trajectories. Second, we present a novel Shifted Thinking Window to mitigate overthinking overhead by removing context-delimiting tags (e.g., <think>. . . </think>) and capping reasoning tokens. Trained with long and short trajectory data and equipped with Shifted Thinking Window, our model, Z1-7B, demonstrates the ability to adjust its reasoning level as the complexity of problems and exhibits efficient test-time scaling across different reasoning tasks that matches R1-Distill-Qwen-7B performance with about 30% of its average thinking tokens. Notably, fine-tuned with only code trajectories, Z1-7B demonstrates generalization to broader reasoning tasks (47.5% on GPQA Diamond). Our analysis of efficient reasoning elicitation also provides valuable insights for future research.

21.0SEDec 30, 2024Code

HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation

Zhaojian Yu, Yilun Zhao, Arman Cohan et al.

We introduce self-invoking code generation, a new task designed to evaluate the progressive reasoning and problem-solving capabilities of LLMs. In this task, models are presented with a base problem and a related, more complex problem. They must solve the base problem and then utilize its solution to address the more complex one. This work features three key contributions. First, we propose a general recipe for generating more challenging versions of existing benchmarks, resulting in three new benchmarks: HumanEval Pro, MBPP Pro, and BigCodeBench-Lite Pro, specifically designed to assess LLMs on self-invoking code generation. Second, from the analysis of experimental results over twenty LLMs on our benchmarks, we have two important observations: (i) Most LLMs excel in traditional code generation benchmarks like HumanEval and MBPP, but their performance declines on self-invoking tasks. For example, o1-mini achieves 96.2% pass@1 on HumanEval but only 76.2% on HumanEval Pro. (ii) On self-invoking code generation task, the instruction-tuned models demonstrate only marginal improvements compared to the base models. Third, we disclose the types of failure modes that exist in our evaluation results. All these results underscore the need for further advancements in self-invoking code generation tasks and provide a new direction for future research on enhancing LLMs' code reasoning capabilities.

2.3CYMay 21, 2024Code

OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models

Zhaojian Yu, Yinghao Wu, Zhuotao Deng et al.

In recent years, large-scale auto-regressive models have made significant progress in various tasks, such as text or video generation. However, the environmental impact of these models has been largely overlooked, with a lack of assessment and analysis of their carbon footprint. To address this gap, we introduce OpenCarbonEval, a unified framework for integrating large-scale models across diverse modalities to predict carbon emissions, which could provide AI service providers and users with a means to estimate emissions beforehand and help mitigate the environmental pressure associated with these models. In OpenCarbonEval, we propose a dynamic throughput modeling approach that could capture workload and hardware fluctuations in the training process for more precise emissions estimates. Our evaluation results demonstrate that OpenCarbonEval can more accurately predict training emissions than previous methods, and can be seamlessly applied to different modal tasks. Specifically, we show that OpenCarbonEval achieves superior performance in predicting carbon emissions for both visual models and language models. By promoting sustainable AI development and deployment, OpenCarbonEval can help reduce the environmental impact of large-scale models and contribute to a more environmentally responsible future for the AI community.

2.3AINov 7, 2024

Discern-XR: An Online Classifier for Metaverse Network Traffic

Yoga Suhas Kuruba Manjunath, Austin Wissborn, Mathew Szymanowski et al.

In this paper, we design an exclusive Metaverse network traffic classifier, named Discern-XR, to help Internet service providers (ISP) and router manufacturers enhance the quality of Metaverse services. Leveraging segmented learning, the Frame Vector Representation (FVR) algorithm and Frame Identification Algorithm (FIA) are proposed to extract critical frame-related statistics from raw network data having only four application-level features. A novel Augmentation, Aggregation, and Retention Online Training (A2R-OT) algorithm is proposed to find an accurate classification model through online training methodology. In addition, we contribute to the real-world Metaverse dataset comprising virtual reality (VR) games, VR video, VR chat, augmented reality (AR), and mixed reality (MR) traffic, providing a comprehensive benchmark. Discern-XR outperforms state-of-the-art classifiers by 7% while improving training efficiency and reducing false-negative rates. Our work advances Metaverse network traffic classification by standing as the state-of-the-art solution.

2.3AINov 7, 2024

ResLearn: Transformer-based Residual Learning for Metaverse Network Traffic Prediction

Yoga Suhas Kuruba Manjunath, Mathew Szymanowski, Austin Wissborn et al.

Our work proposes a comprehensive solution for predicting Metaverse network traffic, addressing the growing demand for intelligent resource management in eXtended Reality (XR) services. We first introduce a state-of-the-art testbed capturing a real-world dataset of virtual reality (VR), augmented reality (AR), and mixed reality (MR) traffic, made openly available for further research. To enhance prediction accuracy, we then propose a novel view-frame (VF) algorithm that accurately identifies video frames from traffic while ensuring privacy compliance, and we develop a Transformer-based progressive error-learning algorithm, referred to as ResLearn for Metaverse traffic prediction. ResLearn significantly improves time-series predictions by using fully connected neural networks to reduce errors, particularly during peak traffic, outperforming prior work by 99%. Our contributions offer Internet service providers (ISPs) robust tools for real-time network management to satisfy Quality of Service (QoS) and enhance user experience in the Metaverse.

11.7ASJan 3, 2020

Re-synchronization using the Hand Preceding Model for Multi-modal Fusion in Automatic Continuous Cued Speech Recognition

Li Liu, Gang Feng, Denis Beautemps et al.

Cued Speech (CS) is an augmented lip reading complemented by hand coding, and it is very helpful to the deaf people. Automatic CS recognition can help communications between the deaf people and others. Due to the asynchronous nature of lips and hand movements, fusion of them in automatic CS recognition is a challenging problem. In this work, we propose a novel re-synchronization procedure for multi-modal fusion, which aligns the hand features with lips feature. It is realized by delaying hand position and hand shape with their optimal hand preceding time which is derived by investigating the temporal organizations of hand position and hand shape movements in CS. This re-synchronization procedure is incorporated into a practical continuous CS recognition system that combines convolutional neural network (CNN) with multi-stream hidden markov model (MSHMM). A significant improvement of about 4.6\% has been achieved retaining 76.6\% CS phoneme recognition correctness compared with the state-of-the-art architecture (72.04\%), which did not take into account the asynchrony of multi-modal fusion in CS. To our knowledge, this is the first work to tackle the asynchronous multi-modal fusion in the automatic continuous CS recognition.