PL · DC · LG · May 31, 2023

PERFOGRAPH: A Numerical Aware Program Graph Representation for Performance Optimization and Program Analysis

arXiv:2306.00210v2 · 12 citations
Originality: Highly original
AI Analysis

The paper addresses the limited accuracy of machine learning-based program analysis and optimization. For developers and researchers, it offers a novel program representation that improves accuracy in tasks such as parallelism discovery and configuration prediction.

The paper tackles the challenge of representing programs for machine learning-based program analysis by proposing PERFOGRAPH, a graph-based representation that captures numerical information and aggregate data structures. It reports state-of-the-art results in the Device Mapping challenge, reducing the error rate by 7.4% on the AMD dataset and 10% on the NVIDIA dataset.

The remarkable growth and significant success of machine learning have expanded its applications into programming languages and program analysis. However, a key challenge in adopting the latest machine learning methods is the representation of programming languages, which directly impacts the ability of machine learning methods to reason about programs. The absence of numerical awareness and aggregate data structure information, together with an improper way of presenting variables, has limited the performance of previous representations. To overcome these limitations, we propose a graph-based program representation called PERFOGRAPH. PERFOGRAPH captures numerical information and aggregate data structures by introducing new nodes and edges. Furthermore, we propose an adapted embedding method to incorporate numerical awareness. These enhancements make PERFOGRAPH a highly flexible and scalable representation that effectively captures programs' intricate dependencies and semantics. Consequently, it serves as a powerful tool for applications such as program analysis, performance optimization, and parallelism discovery. Our experimental results demonstrate that PERFOGRAPH outperforms existing representations and sets new state-of-the-art results by reducing the error rate by 7.4% (AMD dataset) and 10% (NVIDIA dataset) in the well-known Device Mapping challenge. It also sets new state-of-the-art results in various performance optimization tasks such as Parallelism Discovery and NUMA and Prefetchers Configuration prediction.
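To make the idea in the abstract more concrete, here is a minimal Python sketch of a program graph that has dedicated nodes for numeric constants and aggregate data structures, plus a helper that splits a numeric literal into per-character tokens that an embedding layer could consume. Everything in it is an assumption made for illustration: the class name `ProgramGraph`, the node and edge kinds, the helper `digit_tokens`, and the example statement are hypothetical and are not taken from the paper or its code.

```python
from dataclasses import dataclass, field

# Hypothetical node and edge kinds, chosen for illustration only.
# They are NOT the kinds defined by PERFOGRAPH.
NODE_KINDS = {"instruction", "variable", "constant", "array"}
EDGE_KINDS = {"control", "data", "aggregate"}


@dataclass
class ProgramGraph:
    """A toy typed multigraph stored as plain lists."""
    nodes: list = field(default_factory=list)  # (kind, text) tuples
    edges: list = field(default_factory=list)  # (src, dst, kind) tuples

    def add_node(self, kind, text):
        """Append a node and return its integer id."""
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes.append((kind, text))
        return len(self.nodes) - 1

    def add_edge(self, src, dst, kind):
        """Append a directed, typed edge between two existing nodes."""
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        if not (0 <= src < len(self.nodes) and 0 <= dst < len(self.nodes)):
            raise IndexError("edge endpoint is not a known node id")
        self.edges.append((src, dst, kind))


def digit_tokens(literal):
    """Split a numeric literal into (character, position) pairs.

    This is one simple way to expose a number's digits to a model
    instead of treating the whole literal as a single opaque token.
    It is an assumption, not the paper's embedding method.
    """
    return [(ch, pos) for pos, ch in enumerate(literal)]


def build_example():
    """Build a toy graph for the statement `v[0] = 2.5 * x` with `float v[4]`.

    The opcode names are LLVM-style labels used only as node text.
    """
    g = ProgramGraph()
    const = g.add_node("constant", "2.5")
    x = g.add_node("variable", "x")
    mul = g.add_node("instruction", "fmul")
    store = g.add_node("instruction", "store")
    arr = g.add_node("array", "float[4]")

    g.add_edge(const, mul, "data")        # the constant feeds the multiply
    g.add_edge(x, mul, "data")            # the variable feeds the multiply
    g.add_edge(mul, store, "data")        # the product is the stored value
    g.add_edge(mul, store, "control")     # the store executes after the multiply
    g.add_edge(store, arr, "aggregate")   # the store writes into the array
    return g
```

Calling `build_example()` yields a graph with five nodes and five edges, and `digit_tokens("2.5")` returns `[('2', 0), ('.', 1), ('5', 2)]`. Keeping the constant and the array as separate typed nodes means a downstream model can distinguish them from ordinary variables, which is the general property the abstract describes.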

Code Implementations: 1 repo