Kilian Weishaupt · 052608b2
--- a/Micro-benchmarks.md
+++ b/Micro-benchmarks.md
+Benchmarks can help guide design decisions. Testing individual components of the code in
+isolation often gives a clearer picture than timing an entire simulation workflow.
+This is what micro-benchmarks are used for.
+
+A typical question could be: which of the following is faster?
+
+```c++
+auto result = std::pow(number, 3);
+```
+or
+
+```c++
+auto result = number*number*number;
+```
+
+A simple tool for this is the [Google Benchmark library](https://github.com/google/benchmark).
+
+To answer the above question, we just need two files:
+
+A `main.cc` file
+
+```c++
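+// main.cc
+#include <cmath>
+#include <numeric>
+#include <string>
+#include <utility>
+#include <vector>
+
+#include <benchmark/benchmark.h>
+// Dune::power; newer Dune versions declare it in <dune/common/math.hh>
+#include <dune/common/power.hh>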
+
+auto test_std_pow = [](benchmark::State& state, auto input)
+{
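+    // input is the (argc, argv) pair passed in main; argv[1] holds the vector size, e.g. "1e6"
+    std::vector<double> numbers(static_cast<int>(std::stof(input.second[1])));
+    // you could also hard-code the size, e.g. std::vector<double> numbers(1e6);
+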
+    // fill vector with numbers
+    std::iota(numbers.begin(), numbers.end(), 0.0);
+
+    // run actual benchmark
+    for (auto _ : state)
+    {
+        // You could also omit the inner loop and just call std::pow on a fixed number.
+        // Using different numbers seems more realistic, though.
+        // DoNotOptimize keeps the compiler from optimizing away the otherwise unused result.
+        for (const auto& number : numbers)
+            benchmark::DoNotOptimize(std::pow(number, 3.0));
+    }
+};
+
+auto test_by_hand = [](benchmark::State& state, auto input)
+{
+    std::vector<double> numbers(static_cast<int>(std::stof(input.second[1])));
+    std::iota(numbers.begin(), numbers.end(), 0.0);
+    for (auto _ : state)
+    {
+        for (const auto& number : numbers)
+            benchmark::DoNotOptimize(number*number*number);
+    }
+};
+
+auto test_dune_power = [](benchmark::State& state, auto input)
+{
+    std::vector<double> numbers(static_cast<int>(std::stof(input.second[1])));
+    std::iota(numbers.begin(), numbers.end(), 0.0);
+    for (auto _ : state)
+    {
+        for (const auto& number : numbers)
+            benchmark::DoNotOptimize(Dune::power(number,3));
+    }
+};
+
+int main(int argc, char** argv)
+{
+    // pass the command-line arguments on so that each benchmark can read the vector size
+    benchmark::RegisterBenchmark("test_std_pow", test_std_pow, std::make_pair(argc, argv));
+    benchmark::RegisterBenchmark("test_by_hand", test_by_hand, std::make_pair(argc, argv));
+    benchmark::RegisterBenchmark("test_dune_power", test_dune_power, std::make_pair(argc, argv));
+    benchmark::Initialize(&argc, argv);
+    benchmark::RunSpecifiedBenchmarks();
+}
+```
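The lambdas above read the vector size from `argv[1]`. Running `./test_bench` without an argument would therefore construct a string from a null pointer. A minimal sketch of a guard, assuming a hypothetical helper `parseSize` and a default of 1e6 elements (neither is part of the original code):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the original main.cc):
// reads the vector size from argv[1], e.g. "1e6".
// Falls back to defaultSize if the argument is missing or cannot be parsed.
std::size_t parseSize(int argc, char** argv, std::size_t defaultSize = 1000000)
{
    if (argc < 2)
        return defaultSize;

    try
    {
        // std::stod accepts scientific notation such as "1e6"
        const double value = std::stod(argv[1]);
        return value >= 1.0 ? static_cast<std::size_t>(value) : defaultSize;
    }
    catch (const std::exception&)
    {
        // e.g. argv[1] is a flag such as "--benchmark_filter=..."
        return defaultSize;
    }
}
```

Inside a benchmark lambda, the size could then be obtained with `parseSize(input.first, input.second)` instead of calling `std::stof` directly.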
+
+and a `CMakeLists.txt`
+
+```cmake
+set(BENCHMARK_ENABLE_INSTALL false)
+set(BENCHMARK_ENABLE_TESTING false)
+set(DUNE_REENABLE_ADD_TEST true)
+
+include(FetchContent)
+FetchContent_Declare(googlebenchmark
+                     GIT_REPOSITORY https://github.com/google/benchmark)
+FetchContent_MakeAvailable(googlebenchmark)
+
+add_executable(test_bench main.cc)
+target_link_libraries(test_bench benchmark::benchmark)
+```
+
+The options set via `set(...)` are important to make the benchmark work in the `dumux`/`dune` environment. `FetchContent` automatically clones the latest version of the Git repository into your local
+`build-cmake` folder. No other system changes are made.
+
+Run the benchmark with, e.g., vectors of size 1e6 via `./test_bench 1e6`.
+The benchmark automatically chooses the number of iterations needed to get reliable results.
+The more expensive an operation, the fewer iterations are usually required.
+
+```
+----------------------------------------------------------
+Benchmark                Time             CPU   Iterations
+----------------------------------------------------------
+test_std_pow      13911309 ns     13887442 ns           50
+test_by_hand        455876 ns       454306 ns         1520
+test_dune_power     458104 ns       456474 ns         1322
+```
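The reported time is per benchmark iteration, and each iteration here sweeps over the whole vector of 1e6 numbers. Dividing by the vector size gives about 13.9 ns per element for `std::pow` and about 0.46 ns for the hand-written product, a factor of roughly 30. The arithmetic can be sketched as follows (the helper names `nsPerElement` and `speedup` are hypothetical and only serve this illustration):

```cpp
// Hypothetical helpers for interpreting the table above.

// Average time per vector element in ns, given the time of one benchmark iteration.
constexpr double nsPerElement(double nsPerIteration, double numElements)
{
    return nsPerIteration / numElements;
}

// How many times faster the candidate is compared to the baseline.
constexpr double speedup(double baselineNs, double candidateNs)
{
    return baselineNs / candidateNs;
}
```

For example, `speedup(13911309.0, 455876.0)` compares `test_std_pow` with `test_by_hand`, and `speedup(13911309.0, 458104.0)` compares it with `test_dune_power`. Both evaluate to slightly more than 30 for the numbers shown above.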
+
+