NILG · Jan 29, 2023

A Deep Reinforcement Learning Framework for Optimizing Congestion Control in Data Centers

arXiv:2301.12558v1 · 3 citations · h-index: 30
Originality: Incremental advance
AI Analysis

This paper addresses network optimization for data center operators, but its contribution is incremental because it builds on existing protocols such as BBR.

The paper tackles the problem of congestion control in data centers by proposing a multiagent reinforcement learning framework to dynamically tune parameters, showing potential to improve network performance metrics like throughput and latency.

Various congestion control protocols have been designed to achieve high performance in different network environments. Modern online learning solutions that delegate the congestion control actions to a machine cannot properly converge in the stringent time scales of data centers. We leverage multiagent reinforcement learning to design a system for dynamic tuning of congestion control parameters at end-hosts in a data center. The system includes agents at the end-hosts to monitor and report the network and traffic states, and agents to run the reinforcement learning algorithm given the states. Based on the state of the environment, the system generates congestion control parameters that optimize network performance metrics such as throughput and latency. As a case study, we examine BBR, an example of a prominent recently developed congestion control protocol. Our experiments demonstrate that the proposed system has the potential to mitigate the problems of static parameters.
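The abstract does not specify the learning algorithm, state features, or reward, so the following is only a rough sketch of the general idea of tuning a parameter from observed throughput and latency. It swaps in a much simpler technique, a single-agent epsilon-greedy bandit, rather than the multiagent reinforcement learning system the paper describes. The candidate values, the reward weighting, and the toy_network model are all made-up assumptions for illustration and are not taken from the paper or from BBR.

```python
import random

# Hypothetical candidate values for one tunable parameter (illustrative only).
CANDIDATE_GAINS = [1.0, 1.25, 1.5, 2.0]


def reward(throughput_gbps, latency_ms, latency_weight=0.5):
    """Assumed reward: favor throughput, penalize latency."""
    return throughput_gbps - latency_weight * latency_ms


def toy_network(gain, rng):
    """Stand-in for the monitoring agents: returns (throughput, latency).

    Made-up model: throughput saturates at 10 Gbps while latency keeps
    growing with the gain, so a middle value gives the best trade-off.
    """
    throughput = min(10.0, 8.0 * gain) + rng.gauss(0, 0.1)
    latency = 2.0 * gain ** 2 + rng.gauss(0, 0.1)
    return throughput, latency


def tune(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy selection over CANDIDATE_GAINS.

    Returns the value with the highest estimated reward and all estimates.
    """
    rng = random.Random(seed)
    counts = [0] * len(CANDIDATE_GAINS)
    values = [0.0] * len(CANDIDATE_GAINS)  # running mean reward per candidate
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(CANDIDATE_GAINS))  # explore
        else:
            arm = max(range(len(CANDIDATE_GAINS)), key=lambda i: values[i])  # exploit
        throughput, latency = toy_network(CANDIDATE_GAINS[arm], rng)
        counts[arm] += 1
        # Incremental update of the running mean for the chosen candidate.
        values[arm] += (reward(throughput, latency) - values[arm]) / counts[arm]
    best = max(range(len(CANDIDATE_GAINS)), key=lambda i: values[i])
    return CANDIDATE_GAINS[best], values


if __name__ == "__main__":
    best_gain, estimates = tune()
    print("selected gain:", best_gain)
    print("estimated rewards:", [round(v, 2) for v in estimates])
```

In this toy model a static choice of the largest or smallest value is worse than the middle one, which is the kind of gap that dynamic tuning is meant to close. A real system would replace toy_network with measurements reported by the end-host agents and would need a state-dependent policy rather than a stateless bandit.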
