Scalable Microservice Forensics and Stability Assessment Using Variational Autoencoders
The approach provides a scalable solution for stability assessment and forensics in large container ecosystems that conventional techniques cannot currently monitor, a domain-specific advance.
The paper addresses runtime stability monitoring and forensics in containerized applications. It uses variational autoencoders to learn stable runtime patterns and to adaptively publish forensic data. Compared with conventional methods, this yields a 2 orders of magnitude CPU performance improvement, a 3 orders of magnitude reduction in network transport, and a 4 orders of magnitude reduction in storage costs.
We present a deep-learning-based approach to runtime stability analysis of containerized applications, together with an intelligent publishing algorithm that dynamically adjusts the depth of process-level forensics published to a backend incident analysis repository. The approach applies variational autoencoders (VAEs) to learn the stable runtime patterns of container images, and then instantiates these container-specific VAEs to implement stability detection and adaptive forensics publishing. In performance comparisons using a 50-instance container workload, the VAE-optimized service, relative to a conventional eBPF-based forensic publisher, demonstrates a 2 orders of magnitude (OM) CPU performance improvement, a 3 OM reduction in network transport volume, and a 4 OM reduction in Elasticsearch storage costs. We evaluate the VAE-based stability detection technique against two attacks, CPUMiner and an HTTP flood, and find that it is effective in isolating both anomalies. We believe this technique provides a novel approach to integrating fine-grained process monitoring and digital-forensic services into large container ecosystems that today simply cannot be monitored by conventional techniques.
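As a rough illustration of how stability detection could drive adaptive publishing, the sketch below maps a reconstruction error to a forensic detail level. It is not the paper's algorithm: the mean-plus-k-standard-deviations threshold, the three depth levels, the function names `fit_threshold` and `publish_depth`, and the error values (stand-ins for what a trained container-specific VAE would produce) are all assumptions made for illustration.

```python
from statistics import mean, stdev


def fit_threshold(stable_errors, k=3.0):
    """Assumed rule: threshold = mean + k * sample std of errors from stable runs."""
    return mean(stable_errors) + k * stdev(stable_errors)


def publish_depth(error, threshold):
    """Map a reconstruction error to a hypothetical forensic detail level."""
    if error <= threshold:
        return "summary"   # stable: publish only a compact summary
    if error <= 2 * threshold:
        return "process"   # mildly unstable: publish process-level records
    return "full"          # clearly unstable: publish full forensic detail


# Made-up reconstruction errors standing in for VAE output on stable behavior.
stable_errors = [0.10, 0.12, 0.11, 0.09, 0.13]
threshold = fit_threshold(stable_errors)

# Three new observations: stable, mildly anomalous, strongly anomalous.
depths = [publish_depth(e, threshold) for e in (0.11, 0.20, 0.90)]
print(threshold, depths)
```

Under these assumptions, a container that stays within its learned pattern publishes almost nothing, which is the intuition behind the reported reductions in network and storage volume; detail is escalated only when the error leaves the stable range.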