Recovery command generation towards automatic recovery in ICT systems by Seq2Seq learning
This work addresses the time-consuming manual selection of recovery commands in ICT systems. Its contribution is incremental: it applies existing Seq2Seq methods to a new domain-specific task.
The paper tackles the problem of automating recovery commands in ICT systems by proposing a Seq2Seq neural network model that learns from past logs and executed commands, achieving high accuracy in estimating recovery commands on synthetic and OpenStack datasets.
With the increase in scale and complexity of ICT systems, their operation increasingly requires automatic recovery from failures. Although current methods can automatically detect anomalies and analyze the root causes of failures, deciding which commands to execute to recover from a failure still depends on manual operation, which is time-consuming. Toward automatic recovery, we propose a method of estimating recovery commands by using Seq2Seq, a neural network model. This model learns complex relationships between logs obtained from equipment and the recovery commands that operators executed in the past. When a new failure occurs, our method estimates plausible commands to recover from the failure on the basis of the collected logs. We conducted experiments using a synthetic dataset and a realistic OpenStack dataset, demonstrating that our method can estimate recovery commands with high accuracy.
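To make the log-to-command framing concrete, the following is a minimal sketch of how past incidents might be turned into source/target token sequences for a Seq2Seq model. It is an illustration under stated assumptions, not the authors' preprocessing: the log tokens, the command tokens, the special markers, and the helper names (`build_vocab`, `encode`) are all hypothetical, and the neural model itself is omitted.

```python
from typing import Dict, List, Tuple

# Special tokens commonly used in Seq2Seq pipelines (an assumption here,
# not something the abstract specifies).
PAD, SOS, EOS, UNK = "<pad>", "<sos>", "<eos>", "<unk>"


def build_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to every token, reserving ids 0-3 for special tokens."""
    vocab = {PAD: 0, SOS: 1, EOS: 2, UNK: 3}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens: List[str], vocab: Dict[str, int], add_markers: bool = False) -> List[int]:
    """Map tokens to ids; unknown tokens fall back to <unk>."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens]
    if add_markers:
        # The decoder side needs explicit start and end markers.
        ids = [vocab[SOS]] + ids + [vocab[EOS]]
    return ids


# Hypothetical incident history: (log tokens, command tokens executed by an operator).
history: List[Tuple[List[str], List[str]]] = [
    (["ERROR", "svc_a", "heartbeat", "timeout"], ["restart", "svc_a"]),
    (["WARN", "disk_b", "usage", "high"], ["cleanup", "disk_b"]),
]

# Separate vocabularies for the source (logs) and the target (commands).
src_vocab = build_vocab([logs for logs, _ in history])
tgt_vocab = build_vocab([cmds for _, cmds in history])

# Each training pair is (encoder input ids, decoder target ids).
pairs = [
    (encode(logs, src_vocab), encode(cmds, tgt_vocab, add_markers=True))
    for logs, cmds in history
]

for src_ids, tgt_ids in pairs:
    print(src_ids, "->", tgt_ids)
```

Pairs of this shape would be fed to an encoder-decoder network during training. At inference time, the logs of a new failure would be encoded the same way and the decoder would generate command tokens until it emits the end marker.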