Java performance evaluation through rigorous replay compilation

Title: Java performance evaluation through rigorous replay compilation
Publication Type: Conference Paper
Year of Publication: 2008
Authors: Georges, A., Eeckhout, L., Buytaert, D.
Conference Name: Proceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-60558-215-3
Keywords: benchmarking, Java, matched-pair comparison, performance evaluation, replay compilation, virtual machine
Abstract

A managed runtime environment, such as the Java virtual machine, is non-trivial to benchmark. Java performance is affected in various complex ways by the application and its input, as well as by the virtual machine (JIT optimizer, garbage collector, thread scheduler, etc.). In addition, non-determinism due to timer-based sampling for JIT optimization, thread scheduling, and various system effects further complicate the Java performance benchmarking process.

Replay compilation is a recently introduced Java performance analysis methodology that aims at controlling non-determinism to improve experimental repeatability. The key idea of replay compilation is to control the compilation load during experimentation by inducing a pre-recorded compilation plan at replay time. Replay compilation also enables teasing apart performance effects of the application versus the virtual machine.

This paper argues that in contrast to current practice which uses a single compilation plan at replay time, multiple compilation plans add statistical rigor to the replay compilation methodology. By doing so, replay compilation better accounts for the variability observed in compilation load across compilation plans. In addition, we propose matched-pair comparison for statistical data analysis. Matched-pair comparison considers the performance measurements per compilation plan before and after an innovation of interest as a pair, which enables limiting the number of compilation plans needed for accurate performance analysis compared to statistical analysis assuming unpaired measurements.
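To make the matched-pair idea concrete, the sketch below (not taken from the paper; the class name, method name, and all numbers are hypothetical) pairs the before/after measurements obtained under the same compilation plan, computes the per-plan differences, and derives a Student's t confidence interval for the mean difference. Because each difference is taken within a single plan, the variability across compilation plans cancels out of the comparison, which is why fewer plans suffice than with an unpaired analysis.

// MatchedPairSketch.java -- illustrative only; assumes timings per plan are already collected.
public final class MatchedPairSketch {

    // Confidence interval [lower, upper] for the mean of the per-plan
    // differences (after - before), given a Student's t critical value.
    static double[] pairedConfidenceInterval(double[] before, double[] after,
                                             double tCritical) {
        if (before.length != after.length || before.length < 2) {
            throw new IllegalArgumentException("need paired samples, n >= 2");
        }
        int n = before.length;
        double[] diff = new double[n];
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            // Pairing: both measurements come from the same compilation plan.
            diff[i] = after[i] - before[i];
            mean += diff[i];
        }
        mean /= n;
        double ss = 0.0;
        for (double d : diff) {
            ss += (d - mean) * (d - mean);
        }
        double stdErr = Math.sqrt(ss / (n - 1)) / Math.sqrt(n);
        return new double[] { mean - tCritical * stdErr, mean + tCritical * stdErr };
    }

    public static void main(String[] args) {
        // Hypothetical execution times (seconds) for five compilation plans,
        // measured before and after some VM change; not data from the paper.
        double[] base = { 12.1, 11.8, 12.4, 12.0, 11.9 };
        double[] mod  = { 11.7, 11.5, 12.1, 11.6, 11.8 };
        // t critical value for a 95% confidence level with 4 degrees of freedom.
        double[] ci = pairedConfidenceInterval(base, mod, 2.776);
        System.out.printf("mean difference CI: [%.3f, %.3f]%n", ci[0], ci[1]);
        // If the interval excludes zero, the observed difference is
        // statistically significant at the chosen confidence level.
    }
}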

URL: http://doi.acm.org/10.1145/1449764.1449794
DOI: 10.1145/1449764.1449794
Refereed Designation: Refereed