BenchCouncil's View on Benchmarking AI and Other Emerging Workloads
This work addresses the problem of fragmented and inconsistent benchmarking practices for researchers and practitioners in AI and related fields, but its contribution is incremental, since it builds on existing benchmarking concepts.
The paper outlines BenchCouncil's perspective on benchmarking modern workloads such as Big Data and AI, summarizing the challenges as FIDSS and proposing the PRDAERS rules, under which benchmarks should be, among other properties, specified in a paper-and-pencil manner, relevant, diverse, and scalable.
This paper outlines BenchCouncil's view on the challenges, rules, and vision of benchmarking modern workloads such as Big Data, AI or machine learning, and Internet Services. We summarize the challenges of benchmarking modern workloads as FIDSS (Fragmented, Isolated, Dynamic, Service-based, and Stochastic), and propose the PRDAERS benchmarking rules: benchmarks should be specified in a paper-and-pencil manner, relevant, diverse, contain different levels of abstraction, specify the evaluation metrics and methodology, and be repeatable and scalable. We believe that proposing simple but elegant abstractions that help achieve both efficiency and generality is the ultimate goal of benchmarking in the future, though it may not be a pressing one. In light of this vision, we briefly discuss BenchCouncil's related projects.