MapReplay: Trace-Driven Benchmark Generation for Java HashMap

Filippo Schiavio, Andrea Rosà, Júnior Löff, Lubomír Bulej, Petr Tůma, Walter Binder

arXiv:2603.14019

AI Analysis

This paper addresses the problem that benchmarks used to evaluate Java HashMap optimizations are either unrealistic (microbenchmarks) or inefficient (application benchmarks). The contribution is incremental, since it builds on existing application benchmarks.

The authors propose MapReplay, a methodology that traces HashMap API usage in applications and generates replay workloads from those traces. The replay workloads reproduced application-level performance trends while reducing experimentation time.

Hash-based maps, particularly java.util.HashMap, are pervasive in Java applications and the JVM, making their performance critical. Evaluating optimizations is challenging because performance depends on factors such as operation patterns, key distributions, and resizing behavior. Microbenchmarks are fast and repeatable but often oversimplify workloads, failing to capture realistic usage patterns. Application benchmarks (e.g., DaCapo, Renaissance) provide realistic usage but are more expensive to run, prone to variability, and dominated by non-HashMap computation, making map-related performance changes difficult to observe. To address this challenge, we propose MapReplay, a benchmarking methodology that combines the realism of application benchmarks with the efficiency of microbenchmarks. MapReplay traces HashMap API usage and generates a replay workload that reproduces the same operation sequence while faithfully reconstructing internal map states. This enables efficient evaluation of alternative implementations under realistic usage patterns. Applying MapReplay to DaCapo-Chopin and Renaissance yields a benchmark suite, MapReplayBench, that reproduces application-level performance trends while reducing experimentation time and revealing insights that are difficult to obtain from full benchmarks.
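The trace-and-replay idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `MapReplaySketch`, the wrapper `TracingMap`, the `Op` record type, and the `replay` method are all hypothetical names introduced here, and the sketch covers only `put`, `get`, and `remove` on integer keys. It reproduces the operation sequence only; it does not attempt the reconstruction of internal map states (such as bucket layout or resizing history) that the abstract attributes to MapReplay.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapReplaySketch {
    /** Operation kinds captured by this sketch (assumption: a real tracer covers far more of the API). */
    public enum Kind { PUT, GET, REMOVE }

    /** One traced operation. Keys and values are plain ints to keep the trace self-contained. */
    public static final class Op {
        public final Kind kind;
        public final int key;
        public final int value; // only meaningful for PUT

        public Op(Kind kind, int key, int value) {
            this.kind = kind;
            this.key = key;
            this.value = value;
        }
    }

    /** Hypothetical tracing wrapper: forwards each call to a HashMap and appends it to the trace. */
    public static final class TracingMap {
        private final Map<Integer, Integer> delegate = new HashMap<>();
        public final List<Op> trace = new ArrayList<>();

        public Integer put(int key, int value) {
            trace.add(new Op(Kind.PUT, key, value));
            return delegate.put(key, value);
        }

        public Integer get(int key) {
            trace.add(new Op(Kind.GET, key, 0));
            return delegate.get(key);
        }

        public Integer remove(int key) {
            trace.add(new Op(Kind.REMOVE, key, 0));
            return delegate.remove(key);
        }
    }

    /** Replays the recorded operations, in order, against any Map implementation. */
    public static Map<Integer, Integer> replay(List<Op> trace, Map<Integer, Integer> target) {
        for (Op op : trace) {
            switch (op.kind) {
                case PUT:
                    target.put(op.key, op.value);
                    break;
                case GET:
                    target.get(op.key);
                    break;
                case REMOVE:
                    target.remove(op.key);
                    break;
            }
        }
        return target;
    }

    public static void main(String[] args) {
        // Stand-in for application code whose map usage is being traced.
        TracingMap traced = new TracingMap();
        for (int i = 0; i < 1000; i++) {
            traced.put(i, i * 2);
        }
        for (int i = 0; i < 1000; i += 2) {
            traced.remove(i);
        }
        traced.get(1);

        // Replay the same sequence against two implementations and compare final contents.
        Map<Integer, Integer> viaHashMap = replay(traced.trace, new HashMap<>());
        Map<Integer, Integer> viaTreeMap = replay(traced.trace, new TreeMap<>());
        System.out.println("ops replayed: " + traced.trace.size());
        System.out.println("same final contents: " + viaHashMap.equals(viaTreeMap));
    }
}
```

Because `replay` accepts any `Map<Integer, Integer>`, the same trace can be timed against alternative implementations. A realistic harness would also need warm-up iterations and a benchmarking framework such as JMH, which this sketch omits.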
