Zili Meng

NI
h-index: 1
4 papers
795 citations
Novelty: 44%
AI Score: 47

4 Papers

64.8 NI Apr 24 Code
OCC: Physical-Layer Assisted Congestion Control for Real-Time Communications

Yufan Zhuang, Zili Meng, Zehong Lin et al.

Real-time communications (RTC) is a core technology for emerging applications in 6G, such as cloud gaming, teleoperation, and extended reality (XR), which require consistently low latency and high bitrates. Existing RTC solutions fundamentally struggle to maintain low latency while supporting high bitrates due to their reliance on trial-and-error-based mechanisms. These mechanisms fail to probe the available bandwidth (ABW) promptly and accurately, leading to a trade-off between latency reliability and bandwidth utilization. The tension becomes even more critical as cellular bandwidth and application demand fluctuate over a wider range in today's cellular networks. To address this trade-off, we propose OCC, a novel approach that utilizes physical-layer information to explicitly obtain the ABW in real time, enabling rapid adaptation to dynamic wireless network conditions. However, the unique characteristics of RTC, including traffic bursts, application (APP) limits, and encoder lag, make physical-layer-informed control non-trivial. OCC addresses these issues through three strategies: frame-aware bandwidth measurement, APP-limit-aware bandwidth estimation, and encoder-friendly rate control. Extensive over-the-air experiments on an open-source cellular testbed demonstrate that OCC significantly enhances the performance of mobile RTC, reducing tail network latency by $13\%$ to $68\%$ and improving video frame bitrate by $1.2\times$ to $3.5\times$.

NI Oct 1, 2025
Make a Video Call with LLM: A Measurement Campaign over Five Mainstream Apps

Jiayang Xu, Xiangjie Huang, Zijie Li et al.

In 2025, Large Language Model (LLM) services have launched a new feature -- AI video chat -- allowing users to interact with AI agents via real-time video communication (RTC), just like chatting with real people. Despite its significance, no systematic study has characterized the performance of existing AI video chat systems. To address this gap, this paper proposes a comprehensive benchmark with carefully designed metrics across four dimensions: quality, latency, internal mechanisms, and system overhead. Using custom testbeds, we further evaluate five mainstream AI video chatbots with this benchmark. This work provides the research community with a baseline of real-world performance and identifies unique system bottlenecks. Our benchmarking results also open up several research questions for future optimizations of AI video chatbots.

NI Oct 9, 2019
Interpreting Deep Learning-Based Networking Systems

Zili Meng, Minhu Wang, Jiasong Bai et al.

While many deep learning (DL)-based networking systems have demonstrated superior performance, the underlying Deep Neural Networks (DNNs) remain black boxes that are uninterpretable to network operators. This lack of interpretability makes DL-based networking systems prohibitive to deploy in practice. In this paper, we propose Metis, a framework that provides interpretability for two general categories of networking problems spanning local and global control. Accordingly, Metis introduces two different interpretation methods based on decision trees and hypergraphs: it converts DNN policies into interpretable rule-based controllers and highlights critical components through hypergraph analysis. We evaluate Metis over several state-of-the-art DL-based networking systems and show that Metis provides human-readable interpretations while incurring nearly no degradation in performance. We further present four concrete use cases of Metis, showcasing how Metis helps network operators design, debug, deploy, and make ad-hoc adjustments to DL-based networking systems.

LG Oct 3, 2018
Learning Scheduling Algorithms for Data Processing Clusters

Hongzi Mao, Malte Schwarzkopf, Shaileshh Bojja Venkatakrishnan et al.

Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly efficient policies automatically. Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load.