This system has various scripts for analysing the output of VM benchmarking. Any system can be used to perform the benchmarking, e.g. Krun.
Run ./build.sh to build warmup_stats.
Users should call bin/warmup_stats directly; it is a front-end to the other
scripts in bin/. warmup_stats takes either CSV files or
Krun results files as input. As output it can
create HTML or LaTeX / PDF tables and diffs, or PDF plots.
warmup_stats uses the following terminology:
- A process execution is the execution of a single operating system process. In other words, it is equivalent to running a program from the command-line and waiting for it to terminate.
- An in-process iteration is a single iteration of a benchmark within a process execution. In other words, a single process execution executes many in-process iterations.
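The distinction between the two terms can be illustrated with a minimal Python sketch. It is not part of warmup_stats: the names `run_process_execution` and `run_benchmark`, and the toy workload, are assumptions made purely for illustration. Each call to `run_process_execution` starts a fresh operating system process (one process execution), which times several in-process iterations and reports them.

```python
import json
import subprocess
import sys

# Toy benchmark run inside the child process (hypothetical workload).
# It times `n` in-process iterations and prints the timings as JSON.
CHILD_SOURCE = """
import json, sys, time
n = int(sys.argv[1])
timings = []
for _ in range(n):
    start = time.perf_counter()
    sum(i * i for i in range(10000))  # the work being measured
    timings.append(time.perf_counter() - start)
print(json.dumps(timings))
"""


def run_process_execution(num_iterations):
    """Run one process execution; return its in-process iteration times."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE, str(num_iterations)],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def run_benchmark(num_process_executions, num_iterations):
    """Return one list of iteration times (seconds) per process execution."""
    return [run_process_execution(num_iterations)
            for _ in range(num_process_executions)]
```

Calling `run_benchmark(2, 3)` would give two lists of three timings each: two process executions, each containing three in-process iterations.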
The bin/warmup_stats script can take CSV files as input. The format is as
follows. The first row must contain a header for each required column: the
first column must be the process execution index (conventionally a number
0...n, though this is not enforced), the second column the benchmark name,
and n subsequent columns for the n in-process iterations run (each a time
in seconds). Each of these columns can be given arbitrary names, but the order
is vital, and the number of in-process iterations must be the same for each
process execution. An example is as follows:
process_exec_idx, bench_name, 0, 1, 2, ...
0, spectral norm, 0.2, 0.1, 0.4, ...
1, spectral norm, 0.3, 0.15, 0.2, ...
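As a rough sketch of how such a file could be produced, the following Python function writes timings in the layout described above. The function name `write_warmup_csv` and the column names are assumptions chosen for illustration, and the sketch writes fields without the spaces after commas shown in the example; whether warmup_stats treats the two spellings identically is not confirmed here.

```python
import csv


def write_warmup_csv(path, bench_name, process_executions):
    """Write timings in the CSV layout described above.

    `process_executions` holds one entry per process execution; each
    entry is a list of in-process iteration times in seconds.
    """
    # The format requires the same number of in-process iterations
    # for every process execution, so check that before writing.
    lengths = {len(timings) for timings in process_executions}
    if len(lengths) != 1:
        raise ValueError("every process execution needs the same number "
                         "of in-process iterations")
    num_iterations = lengths.pop()

    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Header: execution index, benchmark name, then one column
        # per in-process iteration.
        writer.writerow(["process_exec_idx", "bench_name"]
                        + list(range(num_iterations)))
        for index, timings in enumerate(process_executions):
            writer.writerow([index, bench_name] + list(timings))
```

For instance, `write_warmup_csv("results.csv", "spectral norm", [[0.2, 0.1, 0.4], [0.3, 0.15, 0.2]])` writes a header row followed by one row per process execution.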
When processing CSV with warmup_stats, the --language, --vm and
--uname flags must be specified so that plots can contain the relevant
information, though users can pass arbitrary data to each flag. Note that
these flags are not needed with Krun results files.
The --output-plots <file.pdf> flag converts input data into visual plots.
If the input files are in CSV format, bin/warmup_stats also needs the names of
the language and VM under test, and the output of uname -a on the machine the
benchmarks were run on.
Example usage:
bin/warmup_stats --output-plots plots.pdf -l javascript -v V8 -u "`uname -a`" results.csv
bin/warmup_stats --output-plots plots.pdf results.json.bz2
The --output-table <file> flag converts input data into an HTML table or a
LaTeX / PDF table. Conversion to PDF requires pdflatex to be installed.
If the input files are in CSV format, bin/warmup_stats also needs the names of
the language and VM under test, and the output of uname -a on the machine the
benchmarks were run on.
Example usage (LaTeX / PDF):
bin/warmup_stats --tex --output-table table.tex -l javascript -v V8 -u "`uname -a`" results.csv
bin/warmup_stats --tex --output-table table.tex results.json.bz2
Example usage (HTML):
bin/warmup_stats --html --output-table table.html -l javascript -v V8 -u "`uname -a`" results.csv
bin/warmup_stats --html --output-table table.html results.json.bz2
By default, warmup_stats produces high quality statistics, which can take
considerable time. If you want to experiment quickly, you can use
the --quality low switch: this makes warmup_stats run considerably quicker,
but produces lower quality (and thus less reliable) statistics.
Although the differences are often fairly minor, we do not encourage
the use of --quality low when formally publishing benchmark results.
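When warmup_stats is driven from another script, the command line can be assembled programmatically. The sketch below is one possible way to do so in Python; the helper name `build_command` and its parameters are assumptions, and it uses only flags that appear in this document.

```python
def build_command(output_flag, output_file, inputs, language=None, vm=None,
                  uname=None, fmt=None, low_quality=False):
    """Build an argument list for bin/warmup_stats (illustrative helper).

    `language`, `vm` and `uname` are needed for CSV input and can be
    left as None for Krun results files.
    """
    command = ["bin/warmup_stats"]
    if fmt is not None:
        command.append(fmt)  # "--tex" or "--html" for tables
    command += [output_flag, output_file]
    if low_quality:
        # Faster but less reliable statistics; not for published results.
        command += ["--quality", "low"]
    if language is not None:
        command += ["--language", language]
    if vm is not None:
        command += ["--vm", vm]
    if uname is not None:
        command += ["--uname", uname]
    command += list(inputs)
    return command
```

The returned list could then be passed to `subprocess.run(command, check=True)` from the top-level directory of the repository, for example with `build_command("--output-plots", "plots.pdf", ["results.csv"], language="javascript", vm="V8", uname="...", low_quality=True)`, where the `uname` value is the output of uname -a on the benchmarking machine.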
Benchmarking is often performed in order to test whether a change in a given VM improves or worsens its performance. Unfortunately, the difference between benchmark performance before and after a change is rarely simple. Users will often want a detailed comparison of the results shown in the results tables (above) in order to gain deeper insight into the effects of their changes.
The --output-diff flag converts data from exactly two input files (CSV or
Krun results files) into an HTML table or a LaTeX / PDF table. Conversion to
PDF requires pdflatex to be installed.
If the input files are in CSV format, bin/warmup_stats also needs the names of
the language and VM under test, and the output of uname -a on the machine the
benchmarks were run on.
Example usage (LaTeX / PDF):
bin/warmup_stats --tex --output-diff diff.tex -l javascript -v V8 -u "`uname -a`" before.csv after.csv
bin/warmup_stats --tex --output-diff diff.tex before.json.bz2 after.json.bz2
Example usage (HTML):
bin/warmup_stats --html --output-diff diff.html -l javascript -v V8 -u "`uname -a`" before.csv after.csv
bin/warmup_stats --html --output-diff diff.html before.json.bz2 after.json.bz2
The resulting table will contain results from the after.{csv,json.bz2} file,
compared against the before.{csv,json.bz2} file. VMs and benchmarks that do
not appear in both results files will be omitted from the table.
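To see in advance which benchmarks would be compared, the benchmark names in two CSV inputs can be intersected. The following Python sketch illustrates the omission rule for CSV files only; it is not how warmup_stats itself is implemented, and the function names are assumptions.

```python
import csv


def benchmark_names(csv_path):
    """Return the set of benchmark names in the second CSV column."""
    with open(csv_path, newline="") as handle:
        # skipinitialspace tolerates a space after each comma.
        reader = csv.reader(handle, skipinitialspace=True)
        next(reader, None)  # skip the header row
        return {row[1].strip() for row in reader if len(row) > 1}


def comparable_benchmarks(before_path, after_path):
    """Return (names in both files, names in only one of the files)."""
    before = benchmark_names(before_path)
    after = benchmark_names(after_path)
    return before & after, before ^ after
```

The first set returned holds the benchmarks present in both files; the second holds those present in only one file, which would therefore be left out of the diff table.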