ctbench

Compiler-assisted benchmarking for the study of C++ metaprogram compile times.

ctbench allows you to declare and generate compile-time benchmark batches over given size ranges, run them, aggregate and wrangle Clang profiling data, and plot the results.

The project was made to fit the needs of scientific data collection and analysis. It is therefore not a one-shot profiler, but a set of tools that enables reproducible data gathering from user-defined, variably sized compile-time benchmarks, using Clang's time-trace feature to understand the impact of metaprogramming techniques on compile times. On top of that, ctbench can also measure compiler execution time, to support compilers such as GCC that lack a built-in profiler.

It has two main components: a C++ plotting toolset that can be used as a CLI program and as a library, and a CMake boilerplate library to generate benchmark and graph targets.

The CMake library contains all the boilerplate code to define benchmark targets compatible with the C++ plotting toolset called grapher.

Rule of Cheese can be used as an example project for using ctbench.

Examples

As an example, here are benchmark curves from the Poacher project. The benchmark case sources are available here.

[Figure: Clang ExecuteCompiler time curve from Poacher, generated by the compare_by plotter]

[Figure: Clang Total Frontend time curve from Poacher, generated by the compare_by plotter]

Using ctbench

Build prerequisites

ArchLinux and Ubuntu 23.04 are officially supported, as tests are compiled and executed on both of these Linux distributions. Other distributions, such as Fedora, or any Linux distro that provides CMake 3.25 or higher, should be compatible as well.

  • Required ArchLinux packages: boost boost-libs catch2 clang cmake curl fmt git llvm llvm-libs ninja nlohmann-json tar tbb unzip zip

  • Required Ubuntu packages: catch2 clang cmake curl git libboost-all-dev libclang-dev libfmt-dev libllvm15 libtbb-dev libtbb12 llvm llvm-dev ninja-build nlohmann-json3-dev pkg-config tar unzip zip

The Sciplot library is required too. It can be installed on ArchLinux using the sciplot-git AUR package (NB: the non-git package isn't up-to-date). Otherwise, you can install it for your whole system using CMake or locally using vcpkg:

git clone https://github.com/Microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
./vcpkg/vcpkg install sciplot fmt

cmake --preset release \
  -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake

Note: The fmt dependency is needed, as vcpkg breaks fmt's CMake integration if you already have it installed.

Installing ctbench

git clone https://github.com/jpenuchot/ctbench
cd ctbench
cmake --preset release
cmake --build --preset release
sudo cmake --build --preset release --target install

An AUR package is available for easier installation and updates.

Integrating ctbench in your project

ctbench can be integrated into a CMake project using find_package:

find_package(ctbench REQUIRED)

The example project is provided as a reference for ctbench integration and usage. For more details, an exhaustive CMake API reference is available.
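
As a minimal sketch, a benchmark project's CMakeLists.txt could look like the following (the project name and source file are placeholders; the ctbench_add_benchmark call is explained in the next section):

cmake_minimum_required(VERSION 3.25)
project(my-benchmarks CXX)

# Import the ctbench CMake API (ctbench_add_benchmark, ctbench_add_graph, ...)
find_package(ctbench REQUIRED)

# Declare a benchmark case target, as described below
ctbench_add_benchmark(my_case my_case.cpp 1 32 1 10)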

Declaring a benchmark case target

A benchmark case is represented by a C++ file. It will be "instantiated", i.e. compiled with BENCHMARK_SIZE defined to each value in a range that you provide.

BENCHMARK_SIZE is intended to be used by the preprocessor to generate a benchmark instance of the desired size:

#include <boost/preprocessor/repetition/repeat.hpp>

// First we generate foo<int>().
// foo<int>() uses C++20 requirements to dispatch function calls across 16
// of its instances, according to the value of its integer template parameter.

#define FOO_MAX 16

#define DECL(z, i, nope)                                                       \
  template <int N>                                                             \
  requires(N % FOO_MAX == i) constexpr int foo() { return N * i; }

BOOST_PP_REPEAT(BENCHMARK_SIZE, DECL, FOO_MAX);
#undef DECL

// Now we generate the sum() function for instantiation

int sum() {
  int i = 0;

#define CALL(z, n, nop) i += foo<n>();
  BOOST_PP_REPEAT(BENCHMARK_SIZE, CALL, i);
#undef CALL
  return i;
}
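
To make the preprocessor mechanics concrete, here is roughly what this file expands to when BENCHMARK_SIZE is defined to 2 (expansion written by hand for illustration, whitespace edited for readability):

#define FOO_MAX 16

// Expansion of BOOST_PP_REPEAT(BENCHMARK_SIZE, DECL, FOO_MAX):
template <int N>
requires(N % FOO_MAX == 0) constexpr int foo() { return N * 0; }
template <int N>
requires(N % FOO_MAX == 1) constexpr int foo() { return N * 1; }

// Expansion of BOOST_PP_REPEAT(BENCHMARK_SIZE, CALL, i) inside sum():
int sum() {
  int i = 0;
  i += foo<0>(); // Resolves to the overload constrained with i == 0
  i += foo<1>(); // Resolves to the overload constrained with i == 1
  return i;
}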

By default, only compiler execution time is measured. If you want to generate plots from Clang's profiler data, add the following compile options to your CMake code:

add_compile_options(-ftime-trace -ftime-trace-granularity=1)

Note that plotting profiler data takes more time and will generate a lot of plot files.

Then you can declare a benchmark case target in CMake with the following:

ctbench_add_benchmark(function_selection.requires # Benchmark case name
  function_selection-requires.cpp                 # Benchmark case file
  1                                               # Range begin
  32                                              # Range end
  1                                               # Range step
  10)                                             # Iterations per size
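
This declares one compilation per size in the range [1, 32] with step 1, each repeated 10 times. Conceptually, each instance is compiled as if you ran something like the following (an illustrative command line, not the exact invocation ctbench generates):

clang++ -DBENCHMARK_SIZE=16 -ftime-trace -ftime-trace-granularity=1 \
  -c function_selection-requires.cpp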

Declaring a graph target

Once you have several benchmark cases, you can start writing a graph config.

Example configs can be found here, or by running ctbench-grapher-utils --plotter=<plotter> --command=get-default-config. A list of available plotters can be retrieved by running ctbench-grapher-utils --help.
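
For example, to list the available plotters and then dump the default config of the compare_by plotter used below:

ctbench-grapher-utils --help
ctbench-grapher-utils --plotter=compare_by --command=get-default-config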

{
  "plotter": "compare_by",
  "demangle": true,
  "draw_average": true,
  "draw_points": true,
  "key_ptrs": [
    "/name",
    "/args/detail"
  ],
  "legend_title": "Timings",
  "plot_file_extensions": [
    ".svg",
    ".png"
  ],
  "value_ptr": "/dur",
  "width": 1500,
  "height": 500,
  "x_label": "Benchmark size factor",
  "y_label": "Time (µs)"
}

This configuration uses the compare_by plotter. It compares features targeted by the JSON pointers in key_ptrs across all benchmark cases. This is the easiest way to extract and compare many relevant time-trace features at once.
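
The key_ptrs and value_ptr entries are JSON Pointers (RFC 6901) into individual time-trace events. A Clang time-trace event looks roughly like the following (simplified, with illustrative values): /name selects the event kind, /args/detail the symbol it refers to, and /dur its duration in microseconds.

{
  "name": "InstantiateFunction",
  "ts": 1234,
  "dur": 567,
  "args": { "detail": "foo<5>" }
}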

Back to CMake, you can now declare a graph target using this config to compare time spent in overall compiler execution, the frontend, and the backend across the benchmark cases you declared previously:

ctbench_add_graph(function_selection-feature_comparison-graph # Target name
  ${CONFIGS}/feature_comparison.json                          # Config
  function_selection.enable_if                                # First case
  function_selection.enable_if_t                              # Second case
  function_selection.if_constexpr                             # ...
  function_selection.control
  function_selection.requires)

For each group descriptor, a graph will be generated with one curve per benchmark case. In this case, you would get 3 graphs (ExecuteCompiler, Frontend, and Backend), each with 5 curves (enable_if, enable_if_t, if_constexpr, control, and requires).

Citing ctbench

@article{Penuchot2023,
  doi = {10.21105/joss.05165},
  url = {https://doi.org/10.21105/joss.05165},
  year = {2023},
  publisher = {The Open Journal},
  volume = {8},
  number = {88},
  pages = {5165},
  author = {Jules Penuchot and Joel Falcou},
  title = {ctbench - compile-time benchmarking and analysis},
  journal = {Journal of Open Source Software},
}

Issues

Single size benchmarks

ctbench only supports variable size benchmarks; support for single size benchmarks for A/B comparisons would be nice too. It would require work on everything from the CMake API to the grapher core data structures, and possibly the CLI too.

Consider removing LLVM dependency

If I edit CMakeLists.txt to read:

cmake_minimum_required(VERSION 3.22)

and then run

cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -GNinja ..
ninja

I see

/z/ctbench/grapher/lib/grapher/plotters/debug.cpp:1:10: fatal error: llvm/Support/raw_ostream.h: No such file or directory
    1 | #include <llvm/Support/raw_ostream.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
/z/ctbench/grapher/grapher-utils.cpp:1:10: fatal error: llvm/Support/CommandLine.h: No such file or directory
    1 | #include <llvm/Support/CommandLine.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

Is it possible to remove this LLVM dependency in favour of using the C++ STL instead? If not, is it possible to make the LLVM linking mechanism more robust?

openjournals/joss-reviews#5165

Fix AUR package dependencies

To fix for later:

  • boost is missing
  • sciplot should be specified as sciplot-git (the regular sciplot package uses catch2-v2, which can't be installed alongside the new version)

[PREDICATE] Allow selecting events with a given parent

Add a predicate that matches events with a given parent. The matching could be done by name, by pointer, or simply by using other predicates to make it modular.

The main sub-issue for this one will be keeping track of the event tree structure.

Clearer docs

I'm having a hard time finding a good way to organize the documentation into something that's clear and concise. I'm currently not satisfied by the way JSON configuration values are documented as I feel like users have to dig way too much just to find JSON configuration information.

Additionally, the docs aren't clear about what's internal or not.

I'm leaving this issue open as long as I'm not satisfied with the way the docs are presented; any suggestions or criticism are welcome.

Optimization

nlohmann::json has a nice interface but performance doesn't scale for what we're doing (processing hundreds of megabytes of JSON trace events).

Running ctbench-grapher-plot through perf shows that nlohmann::json-related calls account for most of the overhead.

Making ctbench installable and packageable

I'm not a CMake expert, but maybe someone would be able to help make ctbench installable and usable with find_package. The end goal would be to end up with something that's easy to handle with makepkg (AUR), Conan, and vcpkg.

Generic compiler execution time measurement

One way to easily add (limited) support for GCC would be to measure compiler execution time instead of relying on internal profiling data. This could be useful to compare metaprogram performance scaling across a variety of compilers.

The main issues are:

  • How to measure compiler execution time from CMake
  • How to adapt current data wrangling code for a new kind of format

And so far my favorite solutions are:

  • Adding compiler execution time measurement to the clang time-trace wrapper
  • Using the wrapper to generate a file in the same format as time-trace files, but containing only compiler execution time data, as sketched below
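
As a minimal sketch, assuming the wrapper reuses the time-trace container format with a single top-level event, such a file might look like this (values illustrative):

{
  "traceEvents": [
    { "name": "ExecuteCompiler", "ts": 0, "dur": 1234567, "args": {} }
  ]
}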

Fix compiler execution time measurement

Compiler execution time measurement through ttw seems to be broken: the time values it reports aren't consistent with Clang's time-trace profiler. More work needs to be done to measure compiler execution time more precisely. Hyperfine seems to have that figured out, so it's worth having a look and reproducing their measurement method in ttw.

Set of ctbench-provided configs

I just figured that some configs I wrote in https://github.com/JPenuchot/rule-of-cheese/ are generic enough to be provided as examples. I'd like them to be baked into a default config directory, and maybe to implement a lookup directory list for configurations.

Review JOSS paper

Hey, please find my review of your tool below:

  • Regarding the manuscript I don't have major comments, other than that there are many typos/grammatical errors, especially in the summary.

  • I found the installation of the package a bit tedious:

    • there is no description of how to install on Mac/Windows. The latter can, I guess, be excused, but if the package is not easily installable/available on Mac, this excludes a significant fraction of the research community,
    • to install, I needed to use Docker and reproduce what you did here, and additionally google how to run ./vcpkg/vcpkg install sciplot fmt on ARM. I think it would be easiest to provide a Dockerfile in the example folder that does the exact installation with all dependencies (and have a GitHub action that tests on Mac M1).
  • I think providing install instructions using conda would be helpful, and would also make ctbench straightforward to use for Mac users. I think all required dependencies are on there.

  • I haven't done much C++ development in the last year, but seeing how much traction Meson is gaining, I find it unusual that it isn't provided as an alternative build system. Would that be easy to offer? Also, is the CMake 3.25 dependency really necessary?

  • I think providing a self-contained example with exact instructions on how the visualisations are created would be helpful. I have probably overlooked something, but it's not 100% clear from a quick look how you made them.

  • Create a CONTRIBUTING file with how-to's on contributing new code.

Cheers,
Simon

Explore using ROOT for graph generation

Turning graph saving off makes ctbench-grapher-plot execution a lot shorter (several seconds vs. several minutes). Generating graphs using ROOT instead of Sciplot might be a good way to cut graph generation times by a significant factor.

compare_by plotter: filenames too long for large symbols

The compare_by plotter saves plots with filenames generated using the key itself. This becomes problematic when the symbols grow large enough for filenames to be too long.

The issue is known and currently has no solution other than generating smaller symbols, e.g. benchmark driver functions with names that do not depend on benchmark instantiation size.

Solutions for this problem are welcome. I'm open to different name shortening strategies, generating index files to help find plot files, and other approaches. I will address this issue if it becomes really problematic.

Automated tests

  • CI
  • Test infrastructure
  • Tests implementation

Maybe: add an action for tests compiled with GCC?
