LG | Oct 8, 2022

Dynamically meeting performance objectives for multiple services on a service mesh

arXiv:2210.04002v1 | 10 citations | h-index: 28
Originality: Synthesis-oriented
AI Analysis

This work addresses service management for providers that run their services on a service mesh. Its contribution is incremental: it applies existing RL methods to a specific domain.

The authors address end-to-end management objectives, such as delay bounds and throughput, for multiple services on a service mesh under varying load. A reinforcement learning agent is trained in a simulator, which speeds up learning by orders of magnitude, and the resulting near-optimal control policies are evaluated on a testbed.

We present a framework that lets a service provider achieve end-to-end management objectives under varying load. Dynamic control actions are performed by a reinforcement learning (RL) agent. Our work includes experimentation and evaluation on a laboratory testbed where we have implemented basic information services on a service mesh supported by the Istio and Kubernetes platforms. We investigate different management objectives that include end-to-end delay bounds on service requests, throughput objectives, and service differentiation. These objectives are mapped onto reward functions that an RL agent learns to optimize, by executing control actions, namely, request routing and request blocking. We compute the control policies not on the testbed, but in a simulator, which speeds up the learning process by orders of magnitude. In our approach, the system model is learned on the testbed; it is then used to instantiate the simulator, which produces near-optimal control policies for various management objectives. The learned policies are then evaluated on the testbed using unseen load patterns.
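To make the mapping from management objectives to reward functions concrete, here is a minimal Python sketch of one possible reward for a delay-bound objective. It is an illustration under assumptions, not the reward used in the paper: the names `ServiceObservation`, `carried_load`, and `delay_bound_reward` are hypothetical, and so is the specific form (total carried load when every service meets its delay bound, zero otherwise).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ServiceObservation:
    """Hypothetical per-service measurement taken after a control action."""
    delay_ms: float      # measured end-to-end delay of service requests
    carried_load: float  # requests per second actually served


def carried_load(offered_load: float, blocking_rate: float) -> float:
    """Load that remains after blocking a fraction of the offered requests."""
    if not 0.0 <= blocking_rate <= 1.0:
        raise ValueError("blocking_rate must lie in [0, 1]")
    return offered_load * (1.0 - blocking_rate)


def delay_bound_reward(
    observations: Sequence[ServiceObservation],
    delay_bounds_ms: Sequence[float],
) -> float:
    """Assumed reward: total carried load if all delay bounds hold, else 0."""
    if len(observations) != len(delay_bounds_ms):
        raise ValueError("need exactly one delay bound per service")
    for obs, bound in zip(observations, delay_bounds_ms):
        if obs.delay_ms > bound:
            return 0.0  # any violated bound forfeits the reward
    return sum(obs.carried_load for obs in observations)


# Two services; the second has half of its offered requests blocked.
observations = [
    ServiceObservation(delay_ms=40.0, carried_load=carried_load(10.0, 0.0)),
    ServiceObservation(delay_ms=90.0, carried_load=carried_load(10.0, 0.5)),
]
reward = delay_bound_reward(observations, delay_bounds_ms=[50.0, 100.0])
```

With a reward of this shape, blocking lowers the carried load and therefore the reward, so an agent would only have reason to block requests when doing so keeps delays within their bounds. Throughput or service-differentiation objectives would need a different function.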

