AIWC: OpenCL-based Architecture-Independent Workload Characterisation
This work addresses the need for developers and HPC system designers to understand and optimize application performance across different architectures, though it is incremental as it builds on existing simulation tools.
The authors tackled the problem of workload characterization being tied to specific architectures by developing AIWC, the first architecture-independent framework for heterogeneous platforms, which simulates OpenCL devices to provide metrics for performance prediction and optimization guidance.
Measuring performance-critical characteristics of application workloads is important both for developers, who must understand and optimize the performance of codes, as well as designers and integrators of HPC systems, who must ensure that compute architectures are suitable for the intended workloads. However, if these workload characteristics are tied to architectural features that are specific to a particular system, they may not generalize well to alternative or future systems. An architecture-independent method ensures an accurate characterization of inherent program behaviour, without bias due to architecture-dependent features that vary widely between different types of accelerators. This work presents the first architecture- independent workload characterization framework for heterogeneous compute platforms, proposing a set of metrics determining the suitability and performance of an application on any parallel HPC architecture. The tool, AIWC, is a plugin for the open-source Oclgrind simulator. It supports parallel workloads and is capable of characterizing OpenCL codes currently in use in the supercomputing setting. AIWC simulates an OpenCL device by directly interpreting LLVM instructions, and the resulting metrics may be used for performance prediction and developer feedback to guide device-specific optimizations. An evaluation of the metrics collected over a subset of the Extended OpenDwarfs Benchmark Suite is also presented.