LogLM: From Task-based to Instruction-based Automated Log Analysis
LogLM addresses inefficient software operation and maintenance for system administrators by reducing both the reliance on task-specific data and the burden of deploying multiple models. The contribution is incremental, since it builds on existing log analysis paradigms.
The paper tackles the inflexibility and high deployment costs of task-based automated log analysis by proposing LogLM, a model trained with an instruction-based approach that integrates multiple tasks into a single model. LogLM outperforms existing methods across five log analysis capabilities and shows strong generalization.
Automatic log analysis is essential for the efficient Operation and Maintenance (O&M) of software systems, providing critical insights into system behaviors. However, existing approaches mostly treat log analysis as training a model to perform an isolated task (e.g., anomaly detection or log parsing) using task-specific log-label pairs. These task-based approaches generalize poorly to complex scenarios, depend on task-specific training data, and incur significant cost when multiple models must be deployed. In this paper, we propose an instruction-based training approach that transforms log-label pairs from multiple tasks and domains into a unified format of instruction-response pairs. Our trained model, LogLM, can follow complex user instructions and generalize better across different tasks, thereby increasing flexibility and reducing the dependence on task-specific training data. By integrating major log analysis tasks into a single model, our approach also relieves the model deployment burden. Experimentally, LogLM outperforms existing approaches across five log analysis capabilities and exhibits strong generalization abilities on complex instructions and unseen tasks.
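To make the unified format concrete, the following is a minimal sketch of how log-label pairs from different tasks might be converted into instruction-response pairs. It is an illustration under stated assumptions, not the authors' implementation: the task names, the instruction wording, the `<*>` placeholder convention, and the names `LogLabelPair`, `INSTRUCTION_TEMPLATES`, `to_instruction_pair`, and `build_dataset` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogLabelPair:
    """One task-specific training example (hypothetical structure)."""
    task: str   # e.g. "anomaly_detection" or "log_parsing"
    log: str    # raw log line
    label: str  # task-specific label


# Hypothetical instruction wording; the paper's actual prompts are not shown here.
INSTRUCTION_TEMPLATES = {
    "anomaly_detection": "Decide whether the following log line is normal or anomalous.",
    "log_parsing": "Extract the template of the following log line, "
                   "replacing variable parts with <*>.",
}


def to_instruction_pair(pair: LogLabelPair) -> dict:
    """Map a task-specific log-label pair onto the shared instruction-response format."""
    instruction = INSTRUCTION_TEMPLATES[pair.task]
    return {
        "instruction": f"{instruction}\n{pair.log}",
        "response": pair.label,
    }


def build_dataset(pairs) -> list:
    """Merge pairs from several tasks into a single training set."""
    return [to_instruction_pair(p) for p in pairs]


if __name__ == "__main__":
    examples = [
        LogLabelPair("anomaly_detection", "kernel panic: not syncing", "anomalous"),
        LogLabelPair("log_parsing", "Connected to 10.0.0.1", "Connected to <*>"),
    ]
    for record in build_dataset(examples):
        print(record)
```

Because every task ends up in the same two-field format, a single model can be trained on the merged set, which is the property the abstract relies on to avoid deploying one model per task.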