Scene-Aware Latency Estimation for Microservices via Multi-Scale Graph Fusion
This work addresses the need for accurate latency estimation in microservice systems to enable efficient proactive autoscaling, which is critical for cloud providers to meet service quality guarantees.
The paper proposes MSGAF, a framework that combines multi-scale graph fusion with scene-aware learning to estimate end-to-end latency in cloud-native microservice architectures. The authors report that it significantly outperforms state-of-the-art methods across diverse operational scenarios, although the abstract gives no concrete numbers.
Cloud-native microservice architectures have become prevalent owing to their flexibility and scalability. To satisfy service quality guarantees, cloud providers must implement efficient proactive autoscaling algorithms. However, effective proactive scaling critically depends on accurately estimating end-to-end latency under given resource quotas, which remains highly challenging. Existing methods struggle with the multi-level hierarchical structure and dynamic operational contexts of microservice systems: they primarily employ single-scale modeling, which fails to capture the systems' inherent organizational structure and adapts poorly to varying workload types. To address these limitations, we propose MSGAF, a Multi-Scale Graph Adaptive Fusion framework with Scene-Aware Learning for microservice latency estimation. Our approach constructs hierarchical graph representations through learnable aggregation-based coarsening, capturing system behaviors at the microscopic, mesoscopic, and macroscopic levels. The framework comprises three components: a system state encoding module that transforms heterogeneous monitoring data into unified representations, a multi-scale graph adaptive fusion module that uses graph attention networks for hierarchical feature extraction, and a scene-aware learning module that employs specialized expert networks with dynamic weight allocation for context-specific estimation. Additionally, we design and implement a comprehensive non-intrusive monitoring system for real-time data collection. Extensive experiments on benchmark microservice applications demonstrate that MSGAF significantly outperforms state-of-the-art methods across diverse operational scenarios, providing substantial improvements for cloud-native performance optimization.
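To make the pipeline described above more concrete, the following is a minimal, illustrative sketch and not the authors' implementation. It assumes that "learnable aggregation-based coarsening" can be approximated by a soft cluster-assignment matrix (a DiffPool-style choice, which the abstract does not confirm) and that "dynamic weight allocation" over expert networks resembles a softmax-gated mixture of experts. Graph attention is omitted for brevity, fusion is reduced to concatenating mean-pooled features, and every name, weight, and toy value here (`coarsen`, `gated_estimate`, the logits, the four-service call chain) is hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def coarsen(x, adj, assign_logits):
    """One soft-assignment coarsening step (assumed, DiffPool-style).

    x: n x d node features, adj: n x n adjacency,
    assign_logits: n x c scores that would be learned in practice.
    """
    s = [softmax(row) for row in assign_logits]  # each row sums to 1
    st = transpose(s)
    return matmul(st, x), matmul(matmul(st, adj), s)

def mean_pool(x):
    return [sum(col) / len(x) for col in zip(*x)]

def gated_estimate(z, gate_w, expert_w):
    """Softmax gate over linear 'experts' (assumed mixture-of-experts form)."""
    gate = softmax([dot(w, z) for w in gate_w])
    outputs = [dot(w, z) for w in expert_w]
    return dot(gate, outputs), gate

# Toy microscopic graph: 4 services in a call chain, 2 features each
# (e.g. normalized CPU and request rate; values are made up).
x_micro = [[0.2, 0.5], [0.4, 0.1], [0.9, 0.7], [0.3, 0.3]]
adj_micro = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]

# Fixed logits stand in for learned assignment parameters.
logits_meso = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]  # 4 -> 2 groups
logits_macro = [[0.0], [0.0]]                                   # 2 -> 1 node

x_meso, adj_meso = coarsen(x_micro, adj_micro, logits_meso)
x_macro, adj_macro = coarsen(x_meso, adj_meso, logits_macro)

# Fuse the three scales by concatenating their mean-pooled features.
fused = mean_pool(x_micro) + mean_pool(x_meso) + mean_pool(x_macro)

# Two hypothetical scene experts with a shared gate.
gate_w = [[0.1] * 6, [-0.1] * 6]
expert_w = [[1.0] * 6, [0.5] * 6]
latency, gate = gated_estimate(fused, gate_w, expert_w)
print(latency, gate)
```

Because each row of the assignment matrix sums to one, coarsening preserves the column totals of the feature matrix and the total edge weight, which gives a quick sanity check on any such implementation. In a trained model the assignment logits, gate weights, and expert networks would all be learned jointly rather than fixed by hand as they are here.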