Quantifiable metrics (benchmarks) on the usage of header-only C++ libraries

Summary (notable points):

  • Two packages benchmarked (one with 78 compilation units, one with 301 compilation units)
  • Traditional Compiling (Multi Unit Compilation) resulted in a 7% faster application (in the 78 unit package); no change in application runtime in the 301 unit package.
  • Both Traditional Compiling and Header-only benchmarks used the same amount of memory when running (in both packages).
  • Header-only Compiling (Single Unit Compilation) resulted in an executable size that was 10% smaller in the 301 unit package (only 1% smaller in the 78 unit package).
  • Traditional Compiling used about a third of the memory that Header-only Compiling used to build, in both packages.
  • Traditional Compiling took three times as long to compile (on the first compilation) but only 4% of the header-only time on recompile (as header-only has to recompile all the sources after any change).
  • Traditional Compiling took longer to link on both the first compilation and subsequent compilations.

Box2D benchmark, data:

box2d_data_gcc.csv

Botan benchmark, data:

botan_data_gcc.csv

Box2D SUMMARY (78 Units)

[Image: Box2D summary table]

Botan SUMMARY (301 Units)

[Image: Botan summary table]

NICE CHARTS:

Box2D executable size:

[Chart: Box2D executable size]

Box2D compile/link/build/run time:

[Chart: Box2D compile/link/build/run time]

Box2D compile/link/build/run max memory usage:

[Chart: Box2D compile/link/build/run max memory usage]

Botan executable size:

[Chart: Botan executable size]

Botan compile/link/build/run time:

[Chart: Botan compile/link/build/run time]

Botan compile/link/build/run max memory usage:

[Chart: Botan compile/link/build/run max memory usage]


Benchmark Details

TL;DR


The projects tested, Box2D and Botan, were chosen because they are potentially computationally expensive, contain a good number of units, and had few or no errors when compiled as a single unit. Many other projects were attempted, but “fixing” them to compile as one unit was consuming too much time. The memory footprint is measured by polling the process at regular intervals and taking the maximum, so short-lived peaks between polls can be missed and the figures might not be fully accurate.

Also, this benchmark does not do automatic header dependency generation (to detect header changes). In a project using a different build system, this may add time to all benchmarks.
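
The polling approach to memory measurement described above can be sketched roughly as follows. This is a minimal illustration and not the code from run.py: `peak_by_polling`, `sample` and `still_running` are hypothetical names, and the readings are made-up numbers. With psutil, the two hooks could wrap `psutil.Process(pid).memory_info().rss` and `psutil.Process(pid).is_running()`.

```python
import time


def peak_by_polling(sample, still_running, interval=0.01):
    """Return the largest value seen while polling `sample()`.

    `sample` returns the current memory footprint in bytes and
    `still_running` returns False once the measured process has exited.
    Both are hypothetical hooks supplied by the caller.
    """
    peak = 0
    while still_running():
        peak = max(peak, sample())
        time.sleep(interval)
    return peak


# Simulated footprint over time (made-up numbers, in bytes).
readings = iter([10_000, 45_000, 30_000, 20_000])
remaining = [4]  # number of polls before the fake process "exits"


def fake_sample():
    remaining[0] -= 1
    return next(readings)


def fake_running():
    return remaining[0] > 0


peak = peak_by_polling(fake_sample, fake_running, interval=0)
print(peak)
```

Any spike that rises and falls between two polls never reaches `sample()`, which is why the recorded maximum can under-report the true peak.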

There are 3 compilers in the benchmark, each with 5 configurations.

Compilers:

  • gcc
  • icc
  • clang

Compiler configurations:

  • Default – default compiler options
  • Optimized native – -O3 -march=native
  • Size optimized – -Os
  • LTO/IPO native – -O3 -flto -march=native with clang and gcc, -O3 -ipo -march=native with icpc/icc
  • Zero optimization – -O0

I think these each can have different bearings on the comparisons between single-unit and multi-unit builds. I included LTO/IPO so we might see how the “proper” way to achieve single-unit-effectiveness compares.
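
As a rough sketch of how the compiler and configuration combine into a command line, the snippet below covers four of the configurations listed above, using the flags exactly as given there. The driver names (`g++`, `clang++`, `icpc`) and the helper names are assumptions for illustration; run.py may assemble its commands differently.

```python
# Assumed C++ driver for each compiler name used in the benchmark.
DRIVERS = {"gcc": "g++", "clang": "clang++", "icc": "icpc"}

# Flags copied from the configuration list; these do not depend on the compiler.
CONFIG_FLAGS = {
    "Default": [],
    "Optimized native": ["-O3", "-march=native"],
    "Size optimized": ["-Os"],
}


def flags_for(compiler, config):
    """Return the flag list for one compiler/configuration pair."""
    if config == "LTO/IPO native":
        # icc spells whole-program optimization -ipo; gcc and clang use -flto.
        lto = "-ipo" if compiler == "icc" else "-flto"
        return ["-O3", lto, "-march=native"]
    return list(CONFIG_FLAGS[config])


def compile_command(compiler, config, source, output):
    """Build the argument list that compiles one source file to an object."""
    return [DRIVERS[compiler], *flags_for(compiler, config), "-c", source, "-o", output]


cmd_gcc = compile_command("gcc", "LTO/IPO native", "unit.cpp", "unit.o")
cmd_icc = compile_command("icc", "LTO/IPO native", "unit.cpp", "unit.o")
print(" ".join(cmd_gcc))
print(" ".join(cmd_icc))
```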

Explanation of csv fields:

  • Test Name – name of the benchmark. Examples: Botan, Box2D.
  • Test Configuration – name of a particular configuration of this test (special cxx flags etc.). Usually the same as Test Name.
  • Compiler – name of the compiler used. Examples: gcc, icc, clang.
  • Compiler Configuration – name of a configuration of compiler options used. Example: gcc opt native
  • Compiler Version String – first line of output of compiler version from the compiler itself. Example: g++ --version produces g++ (GCC) 4.6.1 on my system.
  • Header only – a value of True if this test case was built as a single unit, False if it was built as a multi-unit project.
  • Units – number of units in the test case, even if it is built as a single unit.
  • Compile Time,Link Time,Build Time,Run Time – as it sounds.
  • Re-compile Time AVG,Re-compile Time MAX,Re-link Time AVG,Re-link Time MAX,Re-build Time AVG,Re-build Time MAX – the times across rebuilding the project after touching a single file. Each unit is touched, and for each, the project is rebuilt. The maximum times, and average times are recorded in these fields.
  • Compile Memory,Link Memory,Build Memory,Run Memory,Executable Size – as they sound.
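
To show how these fields might be used, here is a small sketch that pairs the single-unit and multi-unit rows of each configuration and computes the build-time ratio. The sample rows are invented, only a few of the columns are included, and the assumption that the `Header only` column holds the literal text `True` or `False` comes from the field description above rather than from inspecting the real files.

```python
import csv
import io

# Made-up sample rows using the field names described above; real rows
# come from data.csv and contain more columns.
SAMPLE = """Test Name,Compiler,Compiler Configuration,Header only,Build Time
Example,gcc,gcc default,False,30.0
Example,gcc,gcc default,True,10.0
"""


def build_time_ratio(csv_file):
    """Map (test, compiler configuration) to multi-unit / single-unit build time."""
    times = {}
    for row in csv.DictReader(csv_file):
        key = (row["Test Name"], row["Compiler Configuration"])
        kind = "single" if row["Header only"] == "True" else "multi"
        times.setdefault(key, {})[kind] = float(row["Build Time"])
    # Keep only configurations for which both build styles were recorded.
    return {
        key: t["multi"] / t["single"]
        for key, t in times.items()
        if "multi" in t and "single" in t
    }


ratios = build_time_ratio(io.StringIO(SAMPLE))
print(ratios)
```

A ratio above 1 means the multi-unit build took longer than the single-unit build for that configuration. For a real file, pass `open("data.csv", newline="")` instead of the in-memory sample.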

To reproduce the benchmarks:

  • The bulk of the work is done by run.py.
  • Requires psutil (for memory footprint measurements).
  • Requires GNU Make.
  • As it is, requires gcc, clang, icc/icpc in the path. Can be modified to remove any of these of course.
  • Each benchmark should have a data file that lists the units of that benchmark. run.py will then create two test cases, one with each unit compiled separately, and one with all units compiled together. Example: box2d.data. The file format is a JSON string containing a dictionary with the following keys:
    • "units" – a list of c/cpp/cc files that make up the units of this project
    • "executable" – A name of the executable to be compiled.
    • "link_libs" – A space separated list of installed libraries to link to.
    • "include_directores" – A list of directories to include in the project.
    • "command" – optional. special command to execute to run the benchmark. For example, "command": "botan_test --benchmark"
  • This cannot be done easily with every C++ project; there must be no conflicts/ambiguities in the single unit.
  • To add a project to the test cases, modify the list test_base_cases in run.py with the information for the project, including the data file name.
  • If everything runs well, the output file data.csv should contain the benchmark results.
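
The data file format and the two test cases derived from it can be illustrated with the sketch below. The keys are the ones listed above (spelled as given, including `include_directores`); the file names, libraries and the generated file name `all_units.cpp` are invented. How run.py actually produces the single unit is not stated here, so the `#include`-every-unit approach shown is an assumption, although it is a common way to do a single-unit build.

```python
import json

# Hypothetical data file contents; only the key names come from the text.
DATA = """{
  "units": ["src/world.cpp", "src/body.cpp", "main.cpp"],
  "executable": "demo",
  "link_libs": "m pthread",
  "include_directores": ["include"]
}"""


def single_unit_source(units):
    """Return the text of one .cpp file that includes every unit."""
    return "".join('#include "%s"\n' % unit for unit in units)


def build_commands(project, compiler="g++"):
    """Return (multi_unit, single_unit) lists of compiler commands."""
    includes = ["-I" + d for d in project["include_directores"]]
    libs = ["-l" + lib for lib in project["link_libs"].split()]
    exe = project["executable"]

    # Multi-unit: one compile command per unit, then one link command.
    objects = [unit.rsplit(".", 1)[0] + ".o" for unit in project["units"]]
    multi = [
        [compiler, *includes, "-c", unit, "-o", obj]
        for unit, obj in zip(project["units"], objects)
    ]
    multi.append([compiler, *objects, *libs, "-o", exe])

    # Single-unit: compile and link the generated file in one step.
    single = [[compiler, *includes, "all_units.cpp", *libs, "-o", exe]]
    return multi, single


project = json.loads(DATA)
unity_text = single_unit_source(project["units"])
multi, single = build_commands(project)
print(unity_text)
print(len(multi), "commands for multi-unit,", len(single), "for single-unit")
```

Because every unit ends up in one translation unit, two files that each define a `static` helper or an anonymous-namespace symbol with the same name will now collide, which is the kind of conflict mentioned above.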

To produce the bar charts:

  • You should start with a data.csv file produced by the benchmark.
  • Get chart.py. Requires matplotlib.
  • Adjust the fields list to decide which graphs to produce.
  • Run python chart.py data.csv.
  • A file, test.png, should now contain the result.

Box2D

  • Box2D was used from svn as is, revision 251.
  • The benchmark was taken from here, modified here and might not be representative of a good Box2D benchmark, and it might not use enough of Box2D to do this compiler benchmark justice.
  • The box2d.data file was manually written, by finding all the .cpp units.

Botan

  • Using Botan-1.10.3.
  • Data file: botan_bench.data.
  • First ran ./configure.py --disable-asm --with-openssl --enable-modules=asn1,benchmark,block,cms,engine,entropy,filters,hash,kdf,mac,bigint,ec_gfp,mp_generic,numbertheory,mutex,rng,ssl,stream,cvc, which generates the header files and Makefile.
  • I disabled assembly because it might interfere with optimizations that can occur when function boundaries do not block optimization. However, this is conjecture and might be totally wrong.
  • Then ran commands like grep -o "\./src.*cpp" Makefile and grep -o "\./checks.*" Makefile to obtain the .cpp units and put them into botan_bench.data file.
  • Modified /checks/checks.cpp to not call the x509 unit tests, and removed the x509 check, because of a conflict between a Botan typedef and OpenSSL.
  • The benchmark included in the Botan source was used.
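
The two grep commands used to collect the units can be mimicked in Python, which may be convenient when the result has to go straight into the data file. This is only a sketch: the Makefile fragment is invented, and `grep_o` is a hypothetical helper that approximates `grep -o` by scanning line by line with the same patterns.

```python
import re

# Invented Makefile fragment; a real Botan Makefile is much longer.
MAKEFILE = """\
build/lib/aes.o: ./src/block/aes/aes.cpp
\t$(CXX) -c ./src/block/aes/aes.cpp -o build/lib/aes.o
build/checks/bench.o: ./checks/bench.cpp
"""


def grep_o(pattern, text):
    """Approximate `grep -o`: return every match, scanning line by line."""
    matches = []
    for line in text.splitlines():
        matches.extend(re.findall(pattern, line))
    return matches


# The same path can appear on several lines, so duplicates are dropped
# while keeping first-seen order.
units = list(dict.fromkeys(grep_o(r"\./src.*cpp", MAKEFILE)))
checks = list(dict.fromkeys(grep_o(r"\./checks.*", MAKEFILE)))
print(units)
print(checks)
```

As with grep, `.*` is greedy, so a line that names several .cpp files would produce one long match spanning all of them; the patterns rely on each path sitting on its own line or being the last .cpp on the line.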

System specs:

  • OpenSuse 11.4, 32-bit
  • 4GB RAM
  • Intel(R) Core(TM) i7 CPU Q 720 @ 1.60GHz
