Zhiyuan Yao

h-index23

7papers

18citations

Novelty37%

AI Score22

Ranked #178,073 of 194,257 authors (top 92%)#11,000 in AI (top 88%)

7 Papers

4.5AIJun 25, 2022

Towards Modern Card Games with Large-Scale Action Spaces Through Action Representation

Zhiyuan Yao, Tianyu Shi, Site Li et al.

Axie infinity is a complicated card game with a huge-scale action space. This makes it difficult to solve this challenge using generic Reinforcement Learning (RL) algorithms. We propose a hybrid RL framework to learn action representations and game strategies. To avoid evaluating every action in the large feasible action set, our method evaluates actions in a fixed-size set which is determined using action representations. We compare the performance of our method with the other two baseline methods in terms of their sample efficiency and the winning rates of the trained models. We empirically show that our method achieves an overall best winning rate and the best sample efficiency among the three methods.

6.2AIJun 3, 2022Code

Learning Distributed and Fair Policies for Network Load Balancing as Markov Potential Game

Zhiyuan Yao, Zihan Ding

This paper investigates the network load balancing problem in data centers (DCs) where multiple load balancers (LBs) are deployed, using the multi-agent reinforcement learning (MARL) framework. The challenges of this problem consist of the heterogeneous processing architecture and dynamic environments, as well as limited and partial observability of each LB agent in distributed networking systems, which can largely degrade the performance of in-production load balancing algorithms in real-world setups. Centralised-training-decentralised-execution (CTDE) RL scheme has been proposed to improve MARL performance, yet it incurs -- especially in distributed networking systems, which prefer distributed and plug-and-play design scheme -- additional communication and management overhead among agents. We formulate the multi-agent load balancing problem as a Markov potential game, with a carefully and properly designed workload distribution fairness as the potential function. A fully distributed MARL algorithm is proposed to approximate the Nash equilibrium of the game. Experimental evaluations involve both an event-driven simulator and real-world system, where the proposed MARL load balancing algorithm shows close-to-optimal performance in simulations, and superior results over in-production LBs in the real-world system.

3.3CEJul 2, 2024

CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications

Yupeng Cao, Zhiyuan Yao, Zhi Chen et al.

The integration of Large Language Models (LLMs) into financial analysis has garnered significant attention in the NLP community. This paper presents our solution to IJCAI-2024 FinLLM challenge, investigating the capabilities of LLMs within three critical areas of financial tasks: financial classification, financial text summarization, and single stock trading. We adopted Llama3-8B and Mistral-7B as base models, fine-tuning them through Parameter Efficient Fine-Tuning (PEFT) and Low-Rank Adaptation (LoRA) approaches. To enhance model performance, we combine datasets from task 1 and task 2 for data fusion. Our approach aims to tackle these diverse tasks in a comprehensive and integrated manner, showcasing LLMs' capacity to address diverse and complex financial tasks with improved accuracy and decision-making capabilities.

2.3NIJun 2, 2023

A Hybrid Approach for Smart Alert Generation

Yao Zhao, Sophine Zhang, Zhiyuan Yao

Anomaly detection is an important task in network management. However, deploying intelligent alert systems in real-world large-scale networking systems is challenging when we take into account (i) scalability, (ii) data heterogeneity, and (iii) generalizability and maintainability. In this paper, we propose a hybrid model for an alert system that combines statistical models with a whitelist mechanism to tackle these challenges and reduce false positive alerts. The statistical models take advantage of a large database to detect anomalies in time-series data, while the whitelist filters out persistently alerted nodes to further reduce false positives. Our model is validated using qualitative data from customer support cases. Future work includes more feature engineering and input data, as well as including human feedback in the model development process.

3.3TRMar 28, 2024

Reinforcement Learning in Agent-Based Market Simulation: Unveiling Realistic Stylized Facts and Behavior

Zhiyuan Yao, Zheng Li, Matthew Thomas et al.

Investors and regulators can greatly benefit from a realistic market simulator that enables them to anticipate the consequences of their decisions in real markets. However, traditional rule-based market simulators often fall short in accurately capturing the dynamic behavior of market participants, particularly in response to external market impact events or changes in the behavior of other participants. In this study, we explore an agent-based simulation framework employing reinforcement learning (RL) agents. We present the implementation details of these RL agents and demonstrate that the simulated market exhibits realistic stylized facts observed in real-world markets. Furthermore, we investigate the behavior of RL agents when confronted with external market impacts, such as a flash crash. Our findings shed light on the effectiveness and adaptability of RL-based agents within the simulation, offering insights into their response to significant market events.

4.6LGFeb 1, 2024

Control in Stochastic Environment with Delays: A Model-based Reinforcement Learning Approach

Zhiyuan Yao, Ionut Florescu, Chihoon Lee

In this paper we are introducing a new reinforcement learning method for control problems in environments with delayed feedback. Specifically, our method employs stochastic planning, versus previous methods that used deterministic planning. This allows us to embed risk preference in the policy optimization problem. We show that this formulation can recover the optimal policy for problems with deterministic transitions. We contrast our policy with two prior methods from literature. We apply the methodology to simple tasks to understand its features. Then, we compare the performance of the methods in controlling multiple Atari games.

1.2NIFeb 1, 2024

Develop End-to-End Anomaly Detection System

Emanuele Mengoli, Zhiyuan Yao, Wutao Wei

Anomaly detection plays a crucial role in ensuring network robustness. However, implementing intelligent alerting systems becomes a challenge when considering scenarios in which anomalies can be caused by both malicious and non-malicious events, leading to the difficulty of determining anomaly patterns. The lack of labeled data in the computer networking domain further exacerbates this issue, impeding the development of robust models capable of handling real-world scenarios. To address this challenge, in this paper, we propose an end-to-end anomaly detection model development pipeline. This framework makes it possible to consume user feedback and enable continuous user-centric model performance evaluation and optimization. We demonstrate the efficacy of the framework by way of introducing and bench-marking a new forecasting model -- named \emph{Lachesis} -- on a real-world networking problem. Experiments have demonstrated the robustness and effectiveness of the two proposed versions of \emph{Lachesis} compared with other models proposed in the literature. Our findings underscore the potential for improving the performance of data-driven products over their life cycles through a harmonized integration of user feedback and iterative development.