NL-CPS: Reinforcement Learning-Based Kubernetes Control Plane Placement in Multi-Region Clusters
This addresses a critical deployment challenge for Kubernetes users in heterogeneous, multi-region environments, though it appears incremental as it builds on existing RL methods for a specific bottleneck.
The paper tackles the problem of suboptimal Kubernetes control-plane node placement in multi-region clusters by introducing a reinforcement learning framework based on neural contextual bandits that learns optimal placement policies from infrastructure characteristics, demonstrating substantial performance improvements over baseline approaches.
The placement of Kubernetes control-plane nodes is critical to ensuring cluster reliability, scalability, and performance, and therefore represents a significant deployment challenge in heterogeneous, multi-region environments. Existing initialisation procedures typically select control-plane hosts arbitrarily, without considering node resource capacity or network topology, often leading to suboptimal cluster performance and reduced resilience. Given Kubernetes's status as the de facto standard for container orchestration, there is a need to rigorously evaluate how control-plane node placement influences the overall performance of the cluster operating across multiple regions. This paper advances this goal by introducing an intelligent methodology for selecting control-plane node placement across dynamically selected Cloud-Edge resources spanning multiple regions, as part of an automated orchestration system. More specifically, we propose a reinforcement learning framework based on neural contextual bandits that observes operational performance and learns optimal control-plane placement policies from infrastructure characteristics. Experimental evaluation across several geographically distributed regions and multiple cluster configurations demonstrates substantial performance improvements over several baseline approaches.