SE · AI · CE · Jul 23, 2024

A Comprehensive Survey on Root Cause Analysis in (Micro) Services: Methodologies, Challenges, and Trends

arXiv:2408.00803v1 · 22 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This survey addresses system stability and rapid recovery in microservices architectures. Its contribution is incremental: it synthesizes existing research rather than introducing new methods.

This survey tackles the challenge of identifying the root causes of failures in microservices, where complex dependencies and propagating faults obscure the origin of an issue. It provides a comprehensive review of the methodologies, challenges, and trends in root cause analysis techniques.

The complex dependencies and propagative faults inherent in microservices, characterized by a dense network of interconnected services, pose significant challenges in identifying the underlying causes of issues. Prompt identification and resolution of disruptive problems are crucial to ensure rapid recovery and maintain system stability. Numerous methodologies have emerged to address this challenge, primarily focusing on diagnosing failures through symptomatic data. This survey aims to provide a comprehensive, structured review of root cause analysis (RCA) techniques within microservices, exploring methodologies that include metrics, traces, logs, and multi-modal data. It delves deeper into the methodologies, challenges, and future trends within microservices architectures. Positioned at the forefront of AI and automation advancements, it offers guidance for future research directions.

