DC · HC · LG · PF · Jan 26, 2020

A Visual Analytics Framework for Reviewing Streaming Performance Data

arXiv:2001.09399v1 · 20 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of real-time performance monitoring for analysts in high-performance computing, though it appears incremental as it builds on existing streaming and visualization techniques.

The authors address the challenge of analyzing streaming performance data in extreme-scale parallel computing systems by introducing a visual analytics framework with data management, analysis, and interactive visualization modules. A case study on parallel discrete-event simulation demonstrates its effectiveness in identifying bottlenecks and outliers.

Understanding and tuning the performance of extreme-scale parallel computing systems demands a streaming approach due to the computational cost of applying offline algorithms to vast amounts of performance log data. Analyzing large streaming data is challenging because the rate at which data arrives and the limited time available to comprehend it make it difficult for analysts to sufficiently examine the data without missing important changes or patterns. To support streaming data analysis, we introduce a visual analytics framework comprising three modules: data management, analysis, and interactive visualization. The data management module collects various computing and communication performance metrics from the monitored system using streaming data processing techniques and feeds the data to the other two modules. The analysis module automatically identifies important changes and patterns at the required latency. In particular, we introduce a set of online and progressive analysis methods that not only control computational costs but also help analysts better follow the critical aspects of the analysis results. Finally, the interactive visualization module provides analysts with a coherent view of the changes and patterns in the continuously captured performance data. Through a multi-faceted case study on performance analysis of parallel discrete-event simulation, we demonstrate the effectiveness of our framework for identifying bottlenecks and locating outliers.
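To make the idea of an online analysis method concrete, here is a minimal Python sketch of one generic technique: flagging outliers in a metric stream using a running mean and variance (Welford's algorithm). This is not the authors' method; the abstract does not specify their algorithms. The names `RunningStats` and `flag_outliers`, the threshold, the warm-up length, and the sample values are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Welford's online mean/variance; constant memory per metric."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation; 0.0 until two samples have arrived.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_outliers(stream, threshold=3.0, warmup=5):
    """Yield (index, value) for samples far from the running mean.

    Each sample is scored against the statistics of the samples seen
    before it, then folded into those statistics.
    """
    stats = RunningStats()
    for i, x in enumerate(stream):
        if stats.n >= warmup and stats.std > 0:
            if abs(x - stats.mean) / stats.std > threshold:
                yield i, x
        stats.update(x)


# Hypothetical per-step metric values, not data from the paper.
samples = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 50.0, 10.2]
outliers = list(flag_outliers(samples))
print(outliers)
```

Because each sample is processed once and only three numbers are kept per metric, the cost per update stays constant regardless of how much data has streamed past. One design caveat of this sketch: flagged values are still folded into the running statistics, so a large outlier inflates the variance and can mask later ones; a production version might exclude or down-weight them.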

Code Implementations: 1 repo