This week's exercise requires you to optimize matrix multiplication as well as Mandelbrot set calculation. You're provided with implementations in matrix.cpp and mandelbrot.cpp.
Optimize the code in matrix.cpp in terms of performance. Your task is to get as close as possible to our optimized code that we've committed as binaries (or even surpass it!).
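
As a starting point, one common way to speed up the naive triple loop without changing the algorithm is to reorder the loops so that the innermost loop walks through memory contiguously. The sketch below is only an illustration: the function name matmul_ikj, the float element type, and the flat row-major layout are assumptions, since the actual interface is whatever matrix.h declares.

```cpp
#include <cstddef>

// Hypothetical signature (not taken from matrix.h): square n x n matrices
// stored row-major in flat arrays, computing c = a * b.
void matmul_ikj(const float* a, const float* b, float* c, std::size_t n) {
    // Clear the output first, because the loops below accumulate into it.
    for (std::size_t i = 0; i < n * n; ++i) {
        c[i] = 0.0f;
    }
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            // a[i][k] is constant for the whole inner loop.
            const float aik = a[i * n + k];
            // The inner loop reads row k of b and writes row i of c,
            // both with stride 1.
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
}
```

This performs the same n^3 multiplications and additions as the i-j-k ordering, so it is still the naive algorithm. Whether and how much it helps depends on the layout used in matrix.cpp, so measure before and after.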
Optimize the code in mandelbrot.cpp in terms of performance. Your task is to get as close as possible to our optimized code that we've committed as binaries (or even surpass it!).
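
If you vectorize the escape-time loop, the main difficulty is that neighbouring points escape after different numbers of iterations. A possible approach, sketched below with SSE and SSE2 intrinsics only, is to iterate four points at once and use a comparison mask to count iterations per lane. Everything specific here is an assumption for illustration: the function name mandelbrot_4lanes, the float precision, the escape radius of 2, and the way iterations are counted. The real computation and its parameters are defined by mandelbrot.cpp and tests/mandelbrot_params.h.

```cpp
#include <emmintrin.h>  // SSE2 (also pulls in the SSE header xmmintrin.h)

// Hypothetical helper: iteration counts for four points that share the
// imaginary part ci and have real parts cr[0..3].
void mandelbrot_4lanes(const float cr[4], float ci, int max_iter, int out[4]) {
    const __m128 vcr = _mm_loadu_ps(cr);
    const __m128 vci = _mm_set1_ps(ci);
    const __m128 four = _mm_set1_ps(4.0f);  // squared escape radius
    __m128 zr = _mm_setzero_ps();
    __m128 zi = _mm_setzero_ps();
    __m128i count = _mm_setzero_si128();

    for (int it = 0; it < max_iter; ++it) {
        const __m128 zr2 = _mm_mul_ps(zr, zr);
        const __m128 zi2 = _mm_mul_ps(zi, zi);
        // Lanes with |z|^2 <= 4 are still iterating; their mask bits are all 1.
        const __m128 active = _mm_cmple_ps(_mm_add_ps(zr2, zi2), four);
        if (_mm_movemask_ps(active) == 0) {
            break;  // every lane has escaped
        }
        // An all-ones lane is -1 as an integer, so subtracting the mask
        // adds 1 to the counter of each active lane only.
        count = _mm_sub_epi32(count, _mm_castps_si128(active));
        // z = z^2 + c, computed for all lanes; escaped lanes are ignored.
        const __m128 zrzi = _mm_mul_ps(zr, zi);
        zr = _mm_add_ps(_mm_sub_ps(zr2, zi2), vcr);
        zi = _mm_add_ps(_mm_add_ps(zrzi, zrzi), vci);
    }
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), count);
}
```

For example, with ci = 0, max_iter = 50 and real parts {0, 1, -1, 3}, the lanes for 0 and -1 never escape and report 50, while the lanes for 1 and 3 report 3 and 1. This is still the plain escape-time iteration, evaluated four points at a time.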
Deliver libmatrix.so and libmandelbrot.so in the project root directory, exporting the symbols from matrix.h and mandelbrot.h.
- If you want to multi-thread your code, use a maximum of 4 (four) threads. You can expect all inputs of your code to be a multiple of 4 in size.
- If you want to use SIMD in your code, use only SSE and SSE2 intrinsics. You can filter the instruction sets on the "Intel Intrinsics Guide" (see references).
- We know that there are faster algorithms for matrix multiplication, such as the Strassen algorithm. We do not want you to implement them. We want you to optimize the naive version given in matrix.cpp. The same goes for Mandelbrot if you find a faster algorithm.
- For Mandelbrot, you're required to use the parameters given in tests/mandelbrot_params.h. Do not change them.
- Regarding Rust, use the intrinsics from core::arch::x86_64 and not std::simd.
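
To illustrate the threading rule above, here is a minimal sketch of splitting a matrix multiplication into four row bands. It assumes that "a multiple of 4 in size" refers to the matrix dimension n, so the rows divide evenly. The names matmul_rows and matmul_parallel, the float element type, and the flat row-major layout are hypothetical and not taken from matrix.h.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

constexpr std::size_t kThreads = 4;  // upper limit set by the exercise

// Hypothetical helper: computes rows [row_begin, row_end) of c = a * b
// for row-major n x n matrices.
void matmul_rows(const float* a, const float* b, float* c, std::size_t n,
                 std::size_t row_begin, std::size_t row_end) {
    for (std::size_t i = row_begin; i < row_end; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            c[i * n + j] = 0.0f;
        }
        for (std::size_t k = 0; k < n; ++k) {
            const float aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
}

void matmul_parallel(const float* a, const float* b, float* c, std::size_t n) {
    // Exact division under the assumption that n is a multiple of 4.
    const std::size_t rows_per_thread = n / kThreads;
    std::vector<std::thread> workers;
    workers.reserve(kThreads - 1);
    // Spawn three workers; the calling thread takes the fourth band,
    // so no more than four threads run at any time.
    for (std::size_t t = 0; t + 1 < kThreads; ++t) {
        workers.emplace_back(matmul_rows, a, b, c, n,
                             t * rows_per_thread, (t + 1) * rows_per_thread);
    }
    matmul_rows(a, b, c, n, (kThreads - 1) * rows_per_thread, n);
    for (auto& w : workers) {
        w.join();
    }
}
```

Each thread writes a disjoint set of rows of c, so no locking is needed. The same banding idea can be applied to the rows of the Mandelbrot image, although the work per row is uneven there, so a static split may balance less well.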