Filippo Schiavio


3 Papers

PL · May 22
Misleading Microbenchmarks on the Java Virtual Machines

Filippo Schiavio, Lubomír Bulej, Walter Binder

Developers often use microbenchmarks to choose the most performant implementation of a method or a class. On the Java Virtual Machine (JVM), this is commonly done using the Java Microbenchmark Harness (JMH), which addresses common pitfalls of measuring code performance on the JVM. However, even following JMH guidelines cannot overcome the fundamental issue of context. Microbenchmarks inherently execute code in isolation, without interference from other application code competing for CPU resources, such as cache or branch-predictor capacity. On managed runtimes with tiered dynamic compilation, such as the JVM, the speculative, profile-driven nature of compilation decisions means that code performance is highly dependent on profiles collected during early execution. Because profiles usually also include branch probabilities and receiver types (besides code hotness metrics), a badly designed microbenchmark may cause the JVM to collect an unrealistic profile, resulting in aggressive yet misleading optimizations that would not occur in a real application. In this paper, we demonstrate how using microbenchmarks under conditions that induce the JVM to collect unrealistic profiles yields misleading results despite following existing guidelines. We also extend these guidelines by suggesting actions to make microbenchmark results more representative.
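
As a rough illustration of the receiver-type issue described in this abstract, the sketch below defines one method with a single virtual call site and two inputs that would give that call site very different type profiles. All class and method names are hypothetical and are not taken from the paper. The sketch only sets up the two workloads; it does not measure anything, and how a particular JVM optimizes each case depends on the JVM and its version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are illustrative, not taken from the paper.
public class ReceiverProfileSketch {
    interface Shape { double area(); }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        public double area() { return Math.PI * radius * radius; }
    }

    static final class Rect implements Shape {
        final double width, height;
        Rect(double width, double height) { this.width = width; this.height = height; }
        public double area() { return width * height; }
    }

    // Method under test. The s.area() call site is virtual, so the receiver
    // types observed here depend entirely on the input list.
    static double totalArea(List<Shape> shapes) {
        double sum = 0.0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }

    // Input with a single receiver type: the call site only ever sees Square.
    static List<Shape> monomorphicInput(int n) {
        List<Shape> shapes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            shapes.add(new Square(1.0));
        }
        return shapes;
    }

    // Input that cycles through three receiver types at the same call site.
    static List<Shape> mixedInput(int n) {
        List<Shape> shapes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            switch (i % 3) {
                case 0: shapes.add(new Square(1.0)); break;
                case 1: shapes.add(new Circle(1.0)); break;
                default: shapes.add(new Rect(1.0, 2.0)); break;
            }
        }
        return shapes;
    }

    public static void main(String[] args) {
        System.out.println(totalArea(monomorphicInput(1_000)));
        System.out.println(totalArea(mixedInput(1_000)));
    }
}
```

A benchmark that only ever feeds `totalArea` the single-type input would exercise the call site under a profile that an application mixing several `Shape` implementations would never produce, which is the kind of mismatch the abstract warns about.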

PL · May 22
JEDI: Java Evaluation of Declarative and Imperative Queries

Filippo Schiavio, Walter Binder

The Java Stream API aims to increase developer productivity through an easy-to-read declarative syntax for expressing computations. It also simplifies parallel computing by providing a high-level abstraction over common parallelization aspects. Unfortunately, there is a lack of benchmarks specifically targeting stream-based applications, which makes it difficult for researchers and developers of the Java class library to optimize the Stream API. Moreover, in the absence of dedicated benchmarks, it is difficult to analyze the performance of streams and to advise developers on how to write efficient code using the API. In this work we present JEDI, a benchmark suite that targets the Stream API. JEDI is automatically generated by converting SQL benchmarks into Java benchmarks. Our code generator targets different implementations (both stream-based and imperative) of the same query. The ultimate goal of our benchmark suite, and the main contribution of this work, is to analyze the performance of the different implementations to spot inefficient code structures and better alternatives, suggesting best practices to Java developers. Among the multiple implementations we generate, we focus on different parallelization strategies and explain which strategies are most efficient based on characteristics of the processed data. Finally, the generated imperative code defines a baseline that can guide researchers and Java implementers in optimizing the Stream API.
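
To make the idea of several implementations of one query concrete, here is a minimal hand-written sketch of a simple aggregation expressed as a sequential stream, a parallel stream, and an imperative loop. The query, the `LineItem` class, and all method names are assumptions made for illustration; they do not reproduce the code that JEDI generates.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the query and all names are illustrative only.
public class QueryVariantsSketch {
    static final class LineItem {
        final int quantity;
        final long priceCents;
        LineItem(int quantity, long priceCents) {
            this.quantity = quantity;
            this.priceCents = priceCents;
        }
    }

    // Roughly: SELECT SUM(price * quantity) FROM lineitem WHERE quantity < maxQuantity
    static long revenueStream(List<LineItem> items, int maxQuantity) {
        return items.stream()
                .filter(item -> item.quantity < maxQuantity)
                .mapToLong(item -> item.priceCents * item.quantity)
                .sum();
    }

    // Same pipeline, evaluated with a parallel stream.
    static long revenueParallelStream(List<LineItem> items, int maxQuantity) {
        return items.parallelStream()
                .filter(item -> item.quantity < maxQuantity)
                .mapToLong(item -> item.priceCents * item.quantity)
                .sum();
    }

    // Imperative version of the same query, usable as a baseline.
    static long revenueLoop(List<LineItem> items, int maxQuantity) {
        long total = 0;
        for (LineItem item : items) {
            if (item.quantity < maxQuantity) {
                total += item.priceCents * item.quantity;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<LineItem> items = Arrays.asList(
                new LineItem(2, 100), new LineItem(10, 50), new LineItem(3, 10));
        System.out.println(revenueStream(items, 5));
        System.out.println(revenueParallelStream(items, 5));
        System.out.println(revenueLoop(items, 5));
    }
}
```

Prices are kept as integer cents so that all three variants return exactly the same value regardless of evaluation order, which keeps a comparison between them about performance rather than floating-point rounding.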

PL · Mar 14
MapReplay: Trace-Driven Benchmark Generation for Java HashMap

Filippo Schiavio, Andrea Rosà, Júnior Löff et al.

Hash-based maps, particularly java.util.HashMap, are pervasive in Java applications and the JVM, making their performance critical. Evaluating optimizations is challenging because performance depends on factors such as operation patterns, key distributions, and resizing behavior. Microbenchmarks are fast and repeatable but often oversimplify workloads, failing to capture realistic usage patterns. Application benchmarks (e.g., DaCapo, Renaissance) provide realistic usages but are more expensive to run, prone to variability, and dominated by non-HashMap computations, making map-related performance changes difficult to observe. To address this challenge, we propose MapReplay, a benchmarking methodology that combines the realism of application benchmarks with the efficiency of microbenchmarks. MapReplay traces HashMap API usages and generates a replay workload that reproduces the same operation sequence while faithfully reconstructing internal map states. This enables efficient evaluation of alternative implementations under realistic usage patterns. Applying MapReplay to DaCapo-Chopin and Renaissance yields a suite, MapReplayBench, that reproduces application-level performance trends while reducing experimentation time and revealing insights that are difficult to obtain from full benchmarks.
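
The trace-and-replay idea can be sketched in a few lines, under strong simplifying assumptions: a wrapper logs each map operation, and a replay method applies the logged sequence to another `Map` implementation. The names `TracingMap`, `Op`, and `replay` are hypothetical, and this sketch does not attempt the reconstruction of internal map states that the abstract describes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names are illustrative, not MapReplay's actual API.
public class MapTraceSketch {
    enum Kind { PUT, GET, REMOVE }

    // One recorded operation; value is null for GET and REMOVE.
    static final class Op {
        final Kind kind;
        final String key;
        final Integer value;
        Op(Kind kind, String key, Integer value) {
            this.kind = kind;
            this.key = key;
            this.value = value;
        }
    }

    // Forwards each call to a real HashMap and appends it to the trace.
    static final class TracingMap {
        private final Map<String, Integer> delegate = new HashMap<>();
        final List<Op> trace = new ArrayList<>();

        Integer put(String key, Integer value) {
            trace.add(new Op(Kind.PUT, key, value));
            return delegate.put(key, value);
        }

        Integer get(String key) {
            trace.add(new Op(Kind.GET, key, null));
            return delegate.get(key);
        }

        Integer remove(String key) {
            trace.add(new Op(Kind.REMOVE, key, null));
            return delegate.remove(key);
        }
    }

    // Applies the recorded operations, in order, to any Map implementation.
    static Map<String, Integer> replay(List<Op> trace, Map<String, Integer> target) {
        for (Op op : trace) {
            switch (op.kind) {
                case PUT: target.put(op.key, op.value); break;
                case GET: target.get(op.key); break;
                case REMOVE: target.remove(op.key); break;
            }
        }
        return target;
    }

    public static void main(String[] args) {
        TracingMap traced = new TracingMap();
        traced.put("a", 1);
        traced.put("b", 2);
        traced.get("a");
        traced.remove("b");
        System.out.println(replay(traced.trace, new TreeMap<>()));
    }
}
```

Because `replay` accepts any `Map`, the same recorded sequence can be run against alternative implementations, which is the comparison the abstract is aiming at.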