SE · Jun 10, 2017

Darwinian Data Structure Selection

Michail Basios, Lingbo Li, Fan Wu, Leslie Kanthan, Earl Barr

arXiv:1706.03232v3 · 40 citations

AI Analysis

This paper addresses performance optimization for software developers by automating data structure selection, reporting median improvements of roughly 5 to 10% in runtime, memory, and CPU usage across 43 Java projects.

The paper tackles the laborious problem of data structure selection and tuning by introducing ARTEMIS, a multi-objective, cloud-based optimization framework that automatically finds optimal Darwinian Data Structures, achieving median improvements of 4.8% in runtime, 10.1% in memory, and 5.1% in CPU usage across 43 Java projects.

Data structure selection and tuning is laborious but can vastly improve an application's performance and memory footprint. Some data structures share a common interface and enjoy multiple implementations. We call them Darwinian Data Structures (DDS), since we can subject their implementations to survival of the fittest. We introduce ARTEMIS, a multi-objective, cloud-based, search-based optimisation framework that automatically finds optimal, tuned DDS modulo a test suite, then changes an application to use that DDS. ARTEMIS achieves substantial performance improvements for every project among 5 Java projects from the DaCapo benchmark, 8 popular projects, and 30 uniformly sampled projects from GitHub. For execution time, CPU usage, and memory consumption, ARTEMIS finds at least one solution that improves all measures for 86% (37/43) of the projects. The median improvement across the best solutions is 4.8%, 10.1%, and 5.1% for runtime, memory, and CPU usage, respectively. These aggregate results understate ARTEMIS's potential impact. Some of the benchmarks it improves are libraries or utility functions. Two examples are gson, a ubiquitous Java serialization framework, and xalan, Apache's XML transformation tool. ARTEMIS improves gson by 16.5%, 1%, and 2.2% for memory, runtime, and CPU usage, respectively; ARTEMIS improves xalan's memory consumption by 23.5%. Every client of these projects will benefit from these performance improvements.
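To make the idea of a Darwinian Data Structure concrete, the following is a minimal sketch, not ARTEMIS itself. It assumes the common interface is `java.util.List` and compares three candidate implementations (two `ArrayList` variants with different initial capacities, and `LinkedList`) on a toy append workload. The class `DdsSketch`, the helpers `fill` and `timeNanos`, and the workload are hypothetical names introduced for illustration; the single-sample timing here is far cruder than the multi-objective search the abstract describes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class DdsSketch {
    // Builds the list 0..n-1 using whichever List implementation the factory supplies.
    // The code depends only on the List interface, so implementations are interchangeable.
    static List<Integer> fill(Supplier<List<Integer>> factory, int n) {
        List<Integer> xs = factory.get();
        for (int i = 0; i < n; i++) {
            xs.add(i);
        }
        return xs;
    }

    // Crude single-sample timing of one fill run, in nanoseconds (illustrative only).
    static long timeNanos(Supplier<List<Integer>> factory, int n) {
        long start = System.nanoTime();
        fill(factory, n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 200_000;

        // Candidate implementations and tunings behind the same interface.
        Map<String, Supplier<List<Integer>>> candidates = new LinkedHashMap<>();
        candidates.put("ArrayList (default capacity)", ArrayList::new);
        candidates.put("ArrayList (capacity = n)", () -> new ArrayList<>(n));
        candidates.put("LinkedList", LinkedList::new);

        // Keep the candidate with the lowest measured time: "survival of the fittest".
        String best = null;
        long bestTime = Long.MAX_VALUE;
        for (Map.Entry<String, Supplier<List<Integer>>> e : candidates.entrySet()) {
            long t = timeNanos(e.getValue(), n);
            System.out.println(e.getKey() + ": " + t + " ns");
            if (t < bestTime) {
                bestTime = t;
                best = e.getKey();
            }
        }
        System.out.println("Fastest on this run: " + best);
    }
}
```

Because every candidate satisfies the same interface, swapping one for another leaves the program's results unchanged and only shifts its cost profile, which is what makes an automated search over implementations possible. Which candidate wins depends on the workload and the JVM, so no particular outcome should be read into this sketch.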

