SEApr 15

Log-based vs Graph-based Approaches to Fault Diagnosis

arXiv:2604.140199.4h-index: 4
AI Analysis

For AIOps researchers, this provides a benchmark comparison showing when graph-augmented architectures outperform log-based methods, though the gains are incremental.

This paper compares log-based (e.g., BERT) and graph-based (e.g., GNN) models for fault diagnosis, finding that graph-only models underperform log encoders, but combining both achieves the best results on TraceBench and BGL datasets.

Modern distributed systems generate large volumes of logs that can be analyzed to support essential AIOps tasks such as fault diagnosis, which plays a crucial role in maintaining system reliability. Most existing approaches rely on log-based models that treat logs as linear sequences of events. However, such representations discard the structural context between events that are often present in execution logs, such as parent-child dependencies, fan-out (branching), or temporal features. To better capture these relationships, recent works on Graph Neural Networks (GNNs) suggest that representing logs as graphs offers a promising alternative. Building on these observations, this paper conducts a comparative study of log-based encoder architectures (e.g., BERT) and graph-based models (e.g., GNNs) for automated fault diagnosis. We evaluate our models on TraceBench, a trace-oriented log dataset, and on BGL, a more traditional system log dataset, covering both anomaly detection and fault type classification. Our results show that graph-only models fail to outperform encoder baselines. However, integrating learned representations from log encoders into graph-based models achieves the strongest overall performance. These findings highlight conditions under which graph-augmented architectures can outperform traditional log-based approaches.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes