Automating Computer Bottleneck Detection with Belief Nets
This work addresses system performance diagnosis for IT administrators, but it appears incremental because it builds on existing automated bottleneck detection methods.
The paper tackles the problem of diagnosing bottlenecks in computer systems. It applies belief networks to model the interactions between application workloads, Windows NT, and hardware, represents uncertainty with Gaussian distributions, and presents initial diagnostic results.
We describe an application of belief networks to the diagnosis of bottlenecks in computer systems. The technique relies on a high-level functional model of the interaction between application workloads, the Windows NT operating system, and system hardware. Given a workload description, the model predicts the values of observable system counters available from the Windows NT performance monitoring tool. Uncertainty in workloads, predictions, and counter values is characterized with Gaussian distributions. During diagnostic inference, we use observed performance monitor values to find the most probable assignment to the workload parameters. In this paper we provide some background on automated bottleneck detection, describe the structure of the system model, and discuss empirical procedures for model calibration and verification. Part of the calibration process includes generating a dataset to estimate a multivariate Gaussian error model. Initial results in diagnosing bottlenecks are presented.
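To make the diagnostic inference step concrete, the following is a minimal sketch and not the model described in the paper. It assumes, purely for illustration, that the counters depend linearly on the workload parameters: the workload has a Gaussian prior with mean `mu` and covariance `S`, the counters are predicted as `A w + b`, and the prediction error is Gaussian with covariance `R`. Under those assumptions the most probable workload given the observed counters has a closed form. All names (`map_workload`, `A`, `b`, `R`, and the example numbers) are hypothetical.

```python
# Sketch: most probable workload under an ASSUMED linear-Gaussian model.
# The paper's actual functional model is not reproduced here.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def transpose(X):
    return [list(row) for row in zip(*X)]

def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * u for v, u in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def map_workload(mu, S, A, b, R, observed):
    """Return the posterior mean (also the most probable value) of the
    workload parameters given observed counter values.

    mu, S    : prior mean and covariance of the workload (assumed Gaussian)
    A, b     : hypothetical linear map from workload to predicted counters
    R        : covariance of the Gaussian counter prediction error
    observed : observed counter values
    """
    # Counters predicted at the prior mean, and the residual to explain.
    predicted = [p[0] + bi for p, bi in zip(matmul(A, [[m] for m in mu]), b)]
    residual = [[o - p] for o, p in zip(observed, predicted)]
    # Gain: S A^T (A S A^T + R)^{-1}, applied to the residual.
    SAt = matmul(S, transpose(A))
    innovation = [[x + r for x, r in zip(row_x, row_r)]
                  for row_x, row_r in zip(matmul(A, SAt), R)]
    correction = matmul(SAt, solve(innovation, residual))
    return [m + c[0] for m, c in zip(mu, correction)]

# Hypothetical example: two workload parameters, three observed counters.
mu = [50.0, 20.0]                       # prior workload estimate
S = [[100.0, 0.0], [0.0, 25.0]]         # prior uncertainty
A = [[1.0, 0.5], [0.0, 2.0], [0.3, 0.3]]
b = [5.0, 0.0, 1.0]
R = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 1.0]]
print(map_workload(mu, S, A, b, R, observed=[80.0, 50.0, 25.0]))
```

In this sketch, the error covariance `R` plays the role that a calibrated multivariate Gaussian error model would play: larger entries mean the corresponding counter is trusted less when the workload estimate is updated.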