DBApr 17, 2020Code
Automated System Performance Testing at MongoDBHenrik Ingo, David Daly
Distributed Systems Infrastructure (DSI) is MongoDB's framework for running fully automated system performance tests in our Continuous Integration (CI) environment. To run in CI it needs to automate everything end-to-end: provisioning and deploying multi-node clusters, executing tests, tuning the system for repeatable results, and collecting and analyzing the results. Today DSI is MongoDB's most used and most useful performance testing tool. It runs almost 200 different benchmarks in daily CI, and we also use it for manual performance investigations. As we can alert the responsible engineer in a timely fashion, all but one of the major regressions were fixed before the 4.2.0 release. We are also able to catch net new improvements, of which DSI caught 17. We open sourced DSI in March 2020.
SEMar 1, 2020
Change Point Detection in Software Performance TestingDavid Daly, William Brown, Henrik Ingo et al.
We describe our process for automatic detection of performance changes for a software product in the presence of noise. A large collection of tests run periodically as changes to our software product are committed to our source repository, and we would like to identify the commits responsible for performance regressions. Previously, we relied on manual inspection of time series graphs to identify significant changes. That was later replaced with a threshold-based detection system, but neither system was sufficient for finding changes in performance in a timely manner. This work describes our recent implementation of a change point detection system built upon the E-Divisive means algorithm. The algorithm produces a list of change points representing significant changes from a given history of performance results. A human reviews the list of change points for actionable changes, which are then triaged for further inspection. Using change point detection has had a dramatic impact on our ability to detect performance changes. Quantitatively, it has dramatically dropped our false positive rate for performance changes, while qualitatively it has made the entire performance evaluation process easier, more productive (ex. catching smaller regressions), and more timely.