Mimalloc-bench

 

A suite for benchmarking malloc implementations, originally developed for benchmarking mimalloc. It collects various benchmarks from the academic literature, together with automated scripts that pull specific versions of benchmark programs and allocators from GitHub and build them.

Due to the large variance in programs and allocators, the suite is currently only developed for Unix-like systems, specifically Ubuntu with apt-get, Fedora with dnf, and macOS (for a limited set of allocators and benchmarks). The only system-installed allocator used is glibc's implementation, which ships as part of Linux's libc. All other allocators are downloaded and built as part of build-bench-env.sh. If you want to run these benchmarks on a different Linux distribution, look at the setup_packages function to see the packages required to build the full set of allocators.

It is quite easy to add new benchmarks and allocator implementations, so please do!

Enjoy, Daan

Note that the code in the bench directory is not part of mimalloc-bench as such: all programs in the bench directory are governed by their own licenses and copyrights, as detailed in their README.md (or license.txt) files. They are included here only for convenience.

Benchmarking

The build-bench-env.sh script with the all argument will automatically pull all needed benchmarks and allocators and build them in the extern directory:

~/dev/mimalloc-bench> ./build-bench-env.sh all

It starts by installing packages, for which you will need to enter the sudo password. All other programs are built in the mimalloc-bench/extern directory. Use ./build-bench-env.sh -h to see all options.

If everything succeeded, you can run the full benchmark suite (from out/bench) as:

~/dev/mimalloc-bench> cd out/bench
~/dev/mimalloc-bench/out/bench> ../../bench.sh alla allt

Or just test mimalloc and tcmalloc on cfrac and larson with 16 threads:

~/dev/mimalloc-bench/out/bench> ../../bench.sh --procs=16 mi tc cfrac larson

Generally, you can specify the allocators (mi, je, tc, hd, sys (the system allocator), etc.) and the benchmarks (cfrac, espresso, barnes, lean, larson, alloc-test, cscratch, etc.), or select all allocators (alla) and all tests (allt). Use --procs=<n> to set the concurrency, and use --help to see all supported allocators and benchmarks.

Current Allocators

The supported allocators are as follows; see build-bench-env.sh for the versions:

  • dieharder: The DieHarder allocator is an error-resistant memory allocator for Windows, Linux, and Mac OS X.
  • ff: ffmalloc, from the Usenix Security 21 paper
  • fg: The FreeGuard allocator, from the CCS 17 paper
  • gd: The Guarder allocator is a tunable secure allocator by the UTSA.
  • hd: The Hoard allocator by Emery Berger [1]. This is one of the first multi-thread scalable allocators.
  • hm: The Hardened Malloc from GrapheneOS, security-focused.
  • iso: The Isoalloc allocator, isolation-based, aiming to provide a reasonable level of security without sacrificing too much performance.
  • je: The jemalloc allocator by Jason Evans, now developed at Facebook and widely used in practice, for example in FreeBSD and Firefox.
  • lf: The lockfree-malloc allocator, multi-thread scalability-focused.
  • lp: The libpas allocator, used by WebKit.
  • lt: The ltalloc allocator, a multi-threaded memory allocator based on free lists best suited for many small allocations.
  • mesh: The mesh allocator, a memory allocator that automatically reduces the memory footprint of C/C++ applications. Also tested as nomesh with the meshing feature disabled.
  • mi: The mimalloc allocator. We can also test the debug version as dmi (this can be used to check for any bugs in the benchmarks), and the secure version as smi.
  • mng: musl's memory allocator.
  • pa: The PartitionAlloc allocator used in Chromium.
  • rp: The rpmalloc allocator uses 16-byte aligned allocations and is developed by Mattias Jansson at Epic Games, used for example in Haiku.
  • sc: The scalloc allocator, a fast, multicore-scalable, low-fragmentation memory allocator.
  • scudo: The scudo allocator used by Fuchsia and Android.
  • sg: The slimguard allocator, designed to be secure and memory-efficient.
  • sm: The Supermalloc allocator by Bradley Kuszmaul uses hardware transactional memory to speed up parallel operations.
  • sn: The snmalloc allocator is a recent concurrent message passing allocator by Liétar et al. [8].
  • tbb: The Intel TBB allocator that comes with the Threading Building Blocks (TBB) library [7].
  • tc: The tcmalloc allocator which comes as part of the Google performance tools, now maintained by the community.
  • tcg: The tcmalloc allocator, maintained and used by Google.
  • sys: The system allocator. Here we usually use the glibc allocator (which is originally based on Ptmalloc2).

Current Benchmarks

The first set of benchmarks consists of real-world programs, or programs that try to mimic them:

  • barnes: a hierarchical n-body particle solver [4], simulating the gravitational forces between 163840 particles. It uses relatively few allocations compared to cfrac and espresso but is multithreaded.
  • cfrac: by Dave Barrett, implementation of continued fraction factorization, using many small short-lived allocations.
  • espresso: a programmable logic array analyzer, described by Grunwald, Zorn, and Henderson [3] in the context of cache-aware memory allocation.
  • gs: have ghostscript process the entire Intel Software Developer’s Manual PDF, which is around 5000 pages.
  • leanN: The Lean compiler by de Moura et al, version 3.4.1, compiling its own standard library concurrently using N threads (./lean --make -j N). Big real-world workload with intensive allocations.
  • redis: running redis-benchmark, with 1 million requests pushing 10 new list elements and then requesting the head 10 elements, and measures the requests handled per second. Simulates a real-world workload.
  • larsonN: by Larson and Krishnan [2]. Simulates a server workload using 100 separate threads which each allocate and free many objects but leave some objects to be freed by other threads. Larson and Krishnan observe this behavior (which they call bleeding) in actual server applications, and the benchmark simulates this.
  • larsonN-sized: same as larsonN, except that it uses sized deallocation calls, which have a fast path in some allocators.
  • lua: compiling the lua interpreter.
  • z3: perform some computations in z3.

The second set of benchmarks are stress tests and consist of:

  • alloc-test: a modern allocator test developed by OLogN Technologies AG (ITHare.com). Simulates intensive allocation workloads with a Pareto size distribution. The alloc-testN benchmark runs on N cores doing 100·10⁶ allocations per thread with objects up to 1KiB in size. Using commit 94f6cb (master, 2018-07-04).
  • cache-scratch: by Emery Berger [1]. Introduced with the Hoard allocator to test for passive false sharing of cache lines: first some small objects are allocated and given to each thread; each thread then frees its object, immediately allocates another one, and accesses that repeatedly. If an allocator places objects from different threads close to each other, this leads to cache-line contention.
  • cache_trash: part of Hoard benchmarking suite, designed to exercise heap cache locality.
  • glibc-simple and glibc-thread: benchmarks for glibc.
  • malloc-large: part of mimalloc benchmarking suite, designed to exercise large (several MiB) allocations.
  • mleak: checks that terminated threads don't "leak" memory.
  • rptest: modified version of the rpmalloc-benchmark suite.
  • mstress: simulates real-world server-like allocation patterns, using N threads with allocations in powers of 2, where objects can migrate between threads and some have long lifetimes. Not all threads have equal workloads, and after each phase all threads are destroyed and new threads are created, with some objects surviving between phases.
  • rbstress: modified version of allocator_bench, allocates chunks in memory via ruby shenanigans.
  • sh6bench: by MicroQuill as part of SmartHeap. Stress test where some of the objects are freed in the usual last-allocated, first-freed (LIFO) order, but others are freed in reverse order. Using the public source (retrieved 2019-01-02).
  • sh8benchN: by MicroQuill as part of SmartHeap. Stress test for multi-threaded allocation (with N threads) where, just as in larson, some objects are freed by other threads, and some objects are freed in reverse order (as in sh6bench). Using the public source (retrieved 2019-01-02).
  • xmalloc-testN: by Lever and Boreham [5] and Christian Eder. We use the updated version from the SuperMalloc repository. This is a more extreme version of the larson benchmark with 100 purely allocating threads, and 100 purely deallocating threads with objects of various sizes migrating between them. This asymmetric producer/consumer pattern is usually difficult to handle by allocators with thread-local caches.

Finally, there is a security benchmark that checks basic security properties of allocators.

Example

Below is an example (Apr 2019) of the benchmark results on an HP Z4-G4 workstation with a 4-core Intel® Xeon® W2123 at 3.6 GHz with 16GB ECC memory, running Ubuntu 18.04.1 with LibC 2.27 and GCC 7.3.0.

[Figures: bench-z4-1, bench-z4-2 (benchmark results)]

Memory usage:

[Figures: bench-z4-rss-1, bench-z4-rss-2 (memory usage)]

(Note: the xmalloc-testN memory usage should be disregarded, as it allocates more the faster the program runs. Unfortunately, there are no entries for SuperMalloc in the leanN and xmalloc-testN benchmarks, as it faulted on those.)

Results and notable usages

Improvements

Notable usages

References

  • [1] Emery D. Berger, Kathryn S. McKinley, Robert D. Blumofe, and Paul R. Wilson. Hoard: A Scalable Memory Allocator for Multithreaded Applications. In the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-IX). Cambridge, MA, November 2000.

  • [2] P. Larson and M. Krishnan. Memory allocation for long-running server applications. In ISMM, Vancouver, B.C., Canada, 1998.

  • [3] D. Grunwald, B. Zorn, and R. Henderson. Improving the cache locality of memory allocation. In R. Cartwright, editor, Proceedings of the Conference on Programming Language Design and Implementation, pages 177–186, New York, NY, USA, June 1993.

  • [4] J. Barnes and P. Hut. A hierarchical O(n*log(n)) force-calculation algorithm. Nature, 324:446-449, 1986.

  • [5] C. Lever and D. Boreham. Malloc() Performance in a Multithreaded Linux Environment. In USENIX Annual Technical Conference, Freenix Session. San Diego, CA. Jun. 2000. Available at https://github.com/kuszmaul/SuperMalloc/tree/master/tests

  • [6] Timothy Crundal. Reducing Active-False Sharing in TCMalloc. 2016. http://courses.cecs.anu.edu.au/courses/CSPROJECTS/16S1/Reports/Timothy_Crundal_Report.pdf. CS16S1 project at the Australian National University.

  • [7] Alexey Kukanov and Michael J Voss. The Foundations for Scalable Multi-Core Software in Intel Threading Building Blocks. Intel Technology Journal 11 (4). 2007.

  • [8] Paul Liétar, Theodore Butler, Sylvan Clebsch, Sophia Drossopoulou, Juliana Franco, Matthew J Parkinson, Alex Shamis, Christoph M Wintersteiger, and David Chisnall. Snmalloc: A Message Passing Allocator. In Proceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management, 122–135. ACM. 2019.

mimalloc-bench's Issues

Provide a way to bench a commit range of a specific allocator

It would be great to be able to tell mimalloc-bench "I want to bench this allocator, between this commit and this commit", to find performance regressions. One might go even further and specify a difference threshold, to be able to use it in git bisect.
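The threshold idea could be sketched as a small shell predicate. This is only an illustration under stated assumptions: the function name `within_threshold` is hypothetical, the timings are assumed to be integers (for example milliseconds) that the caller has already measured, and nothing here is part of bench.sh. The exit-code convention is the one `git bisect run` uses: exit 0 marks a commit as good, and exit codes 1 to 127 (except 125, which means "skip") mark it as bad.

```shell
#!/bin/sh
# Hypothetical helper, not part of mimalloc-bench.
# Succeeds (exit 0, "good" for git bisect run) when the current timing
# is at most pct percent slower than the baseline; fails (exit 1, "bad")
# otherwise. All three arguments are integers.
within_threshold() {
  baseline=$1
  current=$2
  pct=$3
  # Largest timing still accepted, using integer arithmetic only.
  limit=$(( baseline * (100 + pct) / 100 ))
  [ "$current" -le "$limit" ]
}

# Example: a 1000 ms baseline with a 5% tolerance accepts up to 1050 ms.
if within_threshold 1000 1040 5; then
  echo "no regression"
else
  echo "regression"
fi
```

A script passed to `git bisect run` would build the allocator at the checked-out commit, time one benchmark, and finish by calling such a predicate so that its exit status classifies the commit.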

Hardcoded path for LD_PRELOAD

I can build part of mimalloc-bench for a Linux/AArch64 target. However, I found a hardcoded path for LD_PRELOAD:

out/bench$ ../../bench.sh rp cfrac
...
---- cfrac

run cfrac rp: LD_PRELOAD=/tmp/mimalloc-bench/extern/rpmalloc/bin/linux/release/x86-64/librpmalloc.so ./cfrac 175451865205073170563711388363274837927895
ERROR: ld.so: object '/tmp/mimalloc-bench/extern/rpmalloc/bin/linux/release/x86-64/librpmalloc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

It should not be x86-64 since I am testing aarch64.
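One possible direction is to derive the architecture component of the path from `uname -m` instead of hardcoding it. The sketch below is an assumption-laden illustration: the helper name `rpmalloc_arch_dir` is hypothetical, and the directory name `arm64` for AArch64 builds is a guess about rpmalloc's output layout that should be checked against the actual contents of extern/rpmalloc/bin. Only the `x86-64` name is taken from the log above.

```shell
#!/bin/sh
# Hypothetical helper: map a machine name as reported by `uname -m`
# to the directory name assumed to appear in rpmalloc's build output.
# "x86-64" comes from the path in the log; "arm64" is an assumption.
rpmalloc_arch_dir() {
  case "$1" in
    x86_64|amd64)  echo "x86-64" ;;
    aarch64|arm64) echo "arm64" ;;
    *)             echo "$1" ;;   # fall back to the raw machine name
  esac
}

arch_dir=$(rpmalloc_arch_dir "$(uname -m)")
lib_rp="extern/rpmalloc/bin/linux/release/${arch_dir}/librpmalloc.so"
echo "LD_PRELOAD candidate: ${lib_rp}"
```

Checking that the resulting file exists before exporting LD_PRELOAD would turn the silent "ignored" message from ld.so into an explicit error.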

Provide Dockerfile or ready-made container on dockerhub

Hi!

I'm trying to build and run mimalloc-bench inside a pristine ubuntu:20.04 container, and I'm seeing that ./build-bench-env.sh packages lean does not install enough standard packages to bootstrap from a bare distribution such as a containerized Ubuntu.

During the installation I had to install ~10 extra packages, and I still didn't manage to run the benchmarks properly. Perhaps a Dockerfile or a pre-built container on dockerhub could ease the process for those who don't want to "pollute" their main OS with lots of packages and external software.

new[] and delete in Larson.cpp

@daanx the Larson benchmark does not correctly match new[] to delete[] calls. It allocates as an array, but deletes as an object. For instance, the allocation here

blkp[cblks] = new char[blk_size] ;

and deallocation here

delete blkp[victim] ;

This means that it is supplying size parameters to delete assuming it is deallocating a char rather than a char array. For snmalloc this was causing a lot of memory wastage as it was assuming the size was correct.

I am happy to PR a fix, but wanted to check if you would take it.

build-bench-env.sh fails to install alt allocators (and no bench.sh?)

Ubuntu Disco with more sys info below. The errors below all come from running the build-bench-env.sh file. Note also there is no bench.sh file in this repo and build-bench-env.sh has nothing to create a bench.sh.

I'd recommend checking out the instructions for each of the installers. Also it's generally bad form to install things for users or make directories outside of your repository. If you need to do both of these I would recommend zipping all of this into a docker container.

Hoard Compilation Errors

HEAD is now at c856b43 Fixed release target.
git clone https://github.com/emeryberger/Heap-Layers
Cloning into 'Heap-Layers'...
remote: Enumerating objects: 1710, done.
remote: Total 1710 (delta 0), reused 0 (delta 0), pack-reused 1710
Receiving objects: 100% (1710/1710), 402.50 KiB | 4.38 MiB/s, done.
Resolving deltas: 100% (1121/1121), done.
clang++ -std=c++14 -O3 -DNDEBUG -ffast-math -fno-builtin-malloc -Wall -Wextra -Wshadow -Wconversion -Wuninitialized -g -W -Wconversion -Wall -I/usr/include/nptl -fno-builtin-malloc -pipe -fPIC -DNDEBUG  -I. -Iinclude -Iinclude/util -Iinclude/hoard -Iinclude/superblocks -IHeap-Layers -D_REENTRANT=1 -shared   source/libhoard.cpp source/unixtls.cpp Heap-Layers/wrappers/gnuwrapper.cpp -Bsymbolic -o libhoard.so -ldl -lpthread
clang++ -std=c++14 -O3 -DNDEBUG -ffast-math -fno-builtin-malloc -Wall -Wextra -Wshadow -Wconversion -Wuninitialized -g -W -Wconversion -Wall -I/usr/include/nptl -fno-builtin-malloc -pipe -fPIC -DNDEBUG  -I. -Iinclude -Iinclude/util -Iinclude/hoard -Iinclude/superblocks -IHeap-Layers -D_REENTRANT=1 -shared   source/libhoard.cpp source/unixtls.cpp Heap-Layers/wrappers/gnuwrapper.cpp -Bsymbolic -o libhoard.so -ldl -lpthread
cp libhoard.so /usr/lib
cp: cannot stat '/usr/lib/libhoard.so': Too many levels of symbolic links
make: *** [GNUmakefile:187: Linux-gcc-x86_64-install] Error 1
~/open_source/open_projects/misc/mimalloc-bench

SuperMalloc

/usr/include/stdlib.h:583:14: note: in a call to allocation function ‘aligned_alloc’ declared here
 extern void *aligned_alloc (size_t __alignment, size_t __size)
              ^
lto1: all warnings being treated as errors
lto-wrapper: fatal error: g++ returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [../Makefile.include:122: ../release/aligned_alloc] Error 1
rm ../release/aligned_alloc.o
~/open_source/open_projects/misc/mimalloc-bench

rpmalloc

[10/48] CC test/thread.c
FAILED: build/ninja/linux/release/b'x86_64'/test-57ec084/thread-35aa063.o 
clang -MMD -MT 'build/ninja/linux/release/b'\''x86_64'\''/test-57ec084/thread-35aa063.o' -MF 'build/ninja/linux/release/b'\''x86_64'\''/test-57ec084/thread-35aa063.o'.d -I. -Irpmalloc -Itest -DRPMALLOC_COMPILE=1 -funit-at-a-time -fstrict-aliasing -fno-math-errno -ffinite-math-only -funsafe-math-optimizations -fno-trapping-math -ffast-math -D_GNU_SOURCE=1 -W -Werror -pedantic -Wall -Weverything -Wno-padded -Wno-documentation-unknown-command -std=c11  -DBUILD_RELEASE=1 -O3 -g -funroll-loops -DENABLE_ASSERTS=1 -DENABLE_STATISTICS=1 -c test/thread.c -o 'build/ninja/linux/release/b'\''x86_64'\''/test-57ec084/thread-35aa063.o'
test/thread.c:95:2: error: implicit use of sequentially-consistent atomic may incur stronger memory barriers than necessary [-Werror,-Watomic-implicit-seq-cst]
        __sync_synchronize();
        ^~~~~~~~~~~~~~~~~~
test/thread.c:104:2: error: implicit use of sequentially-consistent atomic may incur stronger memory barriers than necessary [-Werror,-Watomic-implicit-seq-cst]
        __sync_synchronize();
        ^~~~~~~~~~~~~~~~~~
2 errors generated.

Sys Info

uname -a
# Linux [comp] 5.0.0-15-generic #16-Ubuntu SMP 
# Mon May 6 17:41:33 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

g++ -v
# Using built-in specs.
# COLLECT_GCC=g++-9
# COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper
# OFFLOAD_TARGET_NAMES=nvptx-none
# OFFLOAD_TARGET_DEFAULT=1
# Target: x86_64-linux-gnu
# Configured with: ../src/configure -v --with-pkgversion='Ubuntu 9.1.0-2ubuntu2~19.04' --with-bugurl=file:///usr/share/doc/gcc-9/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-9 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
# Thread model: posix
# gcc version 9.1.0 (Ubuntu 9.1.0-2ubuntu2~19.04) 

cmake --version
# cmake version 3.13.4

Use native compilation for allocators

As discussed today, there are pros and cons to using native compilation for allocators (like in #54), but mostly pros: allocators that take advantage of modern CPU features shouldn't be penalized by building for the lowest common denominator.

Since not every build system or script in use has a --enable-native-and-all-perf-stuff-go-go-go option, one way to make use of -march=native would be to simply pass it via the CFLAGS and CXXFLAGS environment variables.
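As a minimal sketch of passing the `-march=native` compiler flag through the environment, the function below appends it to whatever CFLAGS and CXXFLAGS already contain. The function name `add_native_flags` is hypothetical and not part of build-bench-env.sh, and whether a given allocator's build system actually honors these variables is an assumption that would need checking per allocator.

```shell
#!/bin/sh
# Hypothetical helper, not part of build-bench-env.sh.
# Appends -march=native to CFLAGS and CXXFLAGS, keeping any flags
# that are already set, and exports both for child build processes.
add_native_flags() {
  CFLAGS="${CFLAGS:+$CFLAGS }-march=native"
  CXXFLAGS="${CXXFLAGS:+$CXXFLAGS }-march=native"
  export CFLAGS CXXFLAGS
}

add_native_flags
echo "CFLAGS=${CFLAGS}"
echo "CXXFLAGS=${CXXFLAGS}"
```

The `${VAR:+$VAR }` expansion only inserts the old value and a separating space when the variable was already non-empty, so the result never starts with a stray space.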

Improve the speed of the benchmarks

Currently the CI takes around 90 minutes to run, which is a bit excessive. Here is what we could do to reduce this number:

  • Change the parameters for the redis benchmark: SYSMALLOC=1 /__w/mimalloc-bench/mimalloc-bench/extern/redis-6.2.6/src/redis-benchmark -r 1000000 -n 1000000 -q -P 16 lpush a 1 2 3 4 5 lrange a 1 5 takes around a full minute. Given that we have ~16 allocators, it adds up rapidly.
  • Lean takes around 1-3 minutes per allocator, and is unmaintained. Shall we try to move to lean4?

Rename the repository

It would be great to rename this repository since:

  • it's not only about mimalloc anymore
  • it's confusing
  • it makes people suspicious of a bias toward mimalloc

Something like mbench, malloc-bench, …

Don't use sudo when run as root

When running in a throwaway docker environment, most people will not care about creating a dedicated user, as highlighted in #117, so we might want to drop the usage of sudo in this case. And print a big fat red warning about running stuff as root of course :)
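A minimal sketch of this behavior, assuming the script would route its privileged commands through a variable: the names `sudo_prefix` and `SUDO` are hypothetical and not taken from build-bench-env.sh. The effective user id comes from `id -u`, which prints 0 for root.

```shell
#!/bin/sh
# Hypothetical helper, not part of build-bench-env.sh.
# Prints the command prefix to use for privileged operations:
# nothing when the given user id is 0 (root), "sudo" otherwise.
sudo_prefix() {
  if [ "$1" -eq 0 ]; then
    echo ""
  else
    echo "sudo"
  fi
}

SUDO=$(sudo_prefix "$(id -u)")

if [ -z "$SUDO" ]; then
  # Bold red warning on stderr when running as root.
  printf '\033[1;31mWARNING: running as root\033[0m\n' >&2
fi

# Privileged commands would then be written as, for example:
#   $SUDO apt-get install <packages>
echo "privileged command prefix: '${SUDO}'"
```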

documentation: Chrome does not use tcmalloc

Chrome doesn’t use tcmalloc and hasn’t for a long time. They’re using PartitionAlloc on all platforms they’re targeting and have already removed the relevant code from //base/allocator.

Document the benchmarks

It would be nice to have, for each benchmark:

  • its name

  • who wrote it and when

  • why is it used in mimalloc-bench

  • what is it testing

  • is it emulating a real-world software/pattern

  • how to use it as part of the bench suite (like, what flag to pass)

  • cfrac

  • espresso

  • barnes

  • redis

  • lean

  • larson

  • larson-sized

  • mstress

  • rptest

  • alloc-test

  • sh6bench

  • sh8bench

  • xmalloc-test

  • cscratch

  • glibc-simple

  • glibc-thread

  • lean-mathlib

  • gs

  • z3

  • spec

  • spec-bench

  • malloc-large

  • mleak

  • malloc-test

  • cthrash

  • rbstress

This is also a nice opportunity to assess their relevance, and get rid of the unneeded ones.

Mention which allocators need system-wide install

For those not on Debian/Ubuntu, it would be very helpful to know which allocators need to be installed system-wide and which are fetched by the build script.

Even better would be to split that script into a local build script that works on any Linux, and to isolate the parts that assume Debian/Ubuntu. It would then be easier for others to add scripts for Arch, Fedora, Gentoo, ...

Race condition in the rptest benchmark

Running isoalloc with the rptest benchmark resulted in a reproducible segfault, hinting at a race condition in rptest, as found by @struct:

root@67dbecee929b:/# clang -o rptest *.c -lpthread -I. -ggdb -fsanitize=thread
root@67dbecee929b:/# ./rptest 2 0 1 2 500 1000 100 8 16000
crt          2 threads random linear size [8,16000] 500 loops 1000 allocs 100 ops: ==================
WARNING: ThreadSanitizer: data race (pid=17173)
  Write of size 4 at 0x000000fec030 by main thread:
    #0 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:777:19 (rptest+0x4b51b4)
    #1 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Previous read of size 4 at 0x000000fec030 by thread T1:
    #0 benchmark_worker /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:355:10 (rptest+0x4b5a34)
    #1 thread_entry /code/jmmb/mimalloc-bench/bench/rptest/thread.c:29:2 (rptest+0x4b7fc4)

  As if synchronized via sleep:
    #0 nanosleep <null> (rptest+0x4215d8)
    #1 thread_sleep /code/jmmb/mimalloc-bench/bench/rptest/thread.c:78:2 (rptest+0x4b80f4)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:775:3 (rptest+0x4b51a0)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Location is global 'benchmark_start' of size 4 at 0x000000fec030 (rptest+0x000000fec030)

  Thread T1 (tid=17175, running) created by main thread at:
    #0 pthread_create <null> (rptest+0x4241b4)
    #1 thread_run /code/jmmb/mimalloc-bench/bench/rptest/thread.c:41:12 (rptest+0x4b7f14)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:772:29 (rptest+0x4b5128)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

SUMMARY: ThreadSanitizer: data race /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:777:19 in benchmark_run
==================
==================
WARNING: ThreadSanitizer: data race (pid=17173)
  Read of size 8 at 0xfffff5503f68 by thread T1:
    #0 atomic_load_ptr /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:23:14 (rptest+0x4b7e10)
    #1 put_cross_thread_memory /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:312:10 (rptest+0x4b7d14)
    #2 benchmark_worker /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:469:5 (rptest+0x4b6d1c)
    #3 thread_entry /code/jmmb/mimalloc-bench/bench/rptest/thread.c:29:2 (rptest+0x4b7fc4)

  Previous write of size 8 at 0xfffff5503f68 by main thread:
    #0 atomic_store_ptr /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:28:17 (rptest+0x4b79d4)
    #1 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:767:4 (rptest+0x4b4ff4)
    #2 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  As if synchronized via sleep:
    #0 nanosleep <null> (rptest+0x4215d8)
    #1 thread_sleep /code/jmmb/mimalloc-bench/bench/rptest/thread.c:78:2 (rptest+0x4b80f4)
    #2 benchmark_worker /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:356:3 (rptest+0x4b5a4c)
    #3 thread_entry /code/jmmb/mimalloc-bench/bench/rptest/thread.c:29:2 (rptest+0x4b7fc4)

  Location is heap block of size 272 at 0xfffff5503e80 allocated by main thread:
    #0 calloc <null> (rptest+0x4228b4)
    #1 benchmark_malloc /code/jmmb/mimalloc-bench/bench/rptest/./benchmark.h:41:13 (rptest+0x4b3c5c)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:723:8 (rptest+0x4b49a8)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Thread T1 (tid=17175, running) created by main thread at:
    #0 pthread_create <null> (rptest+0x4241b4)
    #1 thread_run /code/jmmb/mimalloc-bench/bench/rptest/thread.c:41:12 (rptest+0x4b7f14)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:772:29 (rptest+0x4b5128)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

SUMMARY: ThreadSanitizer: data race /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:23:14 in atomic_load_ptr
==================
==================
WARNING: ThreadSanitizer: data race (pid=17173)
  Atomic write of size 8 at 0xfffff5503ee0 by thread T2:
    #0 __tsan_atomic64_compare_exchange_val <null> (rptest+0x4790d4)
    #1 atomic_cas_ptr /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:77:9 (rptest+0x4b7e8c)
    #2 put_cross_thread_memory /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:315:12 (rptest+0x4b7d7c)
    #3 benchmark_worker /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:469:5 (rptest+0x4b6d1c)
    #4 thread_entry /code/jmmb/mimalloc-bench/bench/rptest/thread.c:29:2 (rptest+0x4b7fc4)

  Previous read of size 8 at 0xfffff5503ee0 by thread T1:
    #0 atomic_load_ptr /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:23:14 (rptest+0x4b7e10)
    #1 get_cross_thread_memory /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:323:10 (rptest+0x4b7c50)
    #2 benchmark_worker /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:368:14 (rptest+0x4b5b88)
    #3 thread_entry /code/jmmb/mimalloc-bench/bench/rptest/thread.c:29:2 (rptest+0x4b7fc4)

  Location is heap block of size 272 at 0xfffff5503e80 allocated by main thread:
    #0 calloc <null> (rptest+0x4228b4)
    #1 benchmark_malloc /code/jmmb/mimalloc-bench/bench/rptest/./benchmark.h:41:13 (rptest+0x4b3c5c)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:723:8 (rptest+0x4b49a8)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Thread T2 (tid=17176, running) created by main thread at:
    #0 pthread_create <null> (rptest+0x4241b4)
    #1 thread_run /code/jmmb/mimalloc-bench/bench/rptest/thread.c:41:12 (rptest+0x4b7f14)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:772:29 (rptest+0x4b5128)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Thread T1 (tid=17175, running) created by main thread at:
    #0 pthread_create <null> (rptest+0x4241b4)
    #1 thread_run /code/jmmb/mimalloc-bench/bench/rptest/thread.c:41:12 (rptest+0x4b7f14)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:772:29 (rptest+0x4b5128)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

SUMMARY: ThreadSanitizer: data race (/code/jmmb/mimalloc-bench/bench/rptest/rptest+0x4790d4) in __tsan_atomic64_compare_exchange_val
==================
==================
WARNING: ThreadSanitizer: data race (pid=17173)
  Read of size 4 at 0x000000fec034 by thread T2:
    #0 atomic_load32 /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:38:14 (rptest+0x4b7a20)
    #1 benchmark_worker /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:473:8 (rptest+0x4b6d3c)
    #2 thread_entry /code/jmmb/mimalloc-bench/bench/rptest/thread.c:29:2 (rptest+0x4b7fc4)

  Previous atomic write of size 4 at 0x000000fec034 by thread T1:
    #0 __tsan_atomic32_fetch_add <null> (rptest+0x4735a0)
    #1 atomic_incr32 /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:52:9 (rptest+0x4b7dc4)
    #2 benchmark_worker /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:511:3 (rptest+0x4b6d9c)
    #3 thread_entry /code/jmmb/mimalloc-bench/bench/rptest/thread.c:29:2 (rptest+0x4b7fc4)

  Location is global 'benchmark_threads_sync' of size 4 at 0x000000fec034 (rptest+0x000000fec034)

  Thread T2 (tid=17176, running) created by main thread at:
    #0 pthread_create <null> (rptest+0x4241b4)
    #1 thread_run /code/jmmb/mimalloc-bench/bench/rptest/thread.c:41:12 (rptest+0x4b7f14)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:772:29 (rptest+0x4b5128)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Thread T1 (tid=17175, running) created by main thread at:
    #0 pthread_create <null> (rptest+0x4241b4)
    #1 thread_run /code/jmmb/mimalloc-bench/bench/rptest/thread.c:41:12 (rptest+0x4b7f14)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:772:29 (rptest+0x4b5128)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

SUMMARY: ThreadSanitizer: data race /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:38:14 in atomic_load32
==================
==================
WARNING: ThreadSanitizer: data race (pid=17173)
  Read of size 4 at 0xfffff5503ee8 by main thread:
    #0 atomic_load32 /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:38:14 (rptest+0x4b7a20)
    #1 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:789:38 (rptest+0x4b5258)
    #2 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Previous atomic write of size 4 at 0xfffff5503ee8 by thread T2:
    #0 __tsan_atomic32_fetch_add <null> (rptest+0x4735a0)
    #1 atomic_add32 /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:62:9 (rptest+0x4b7c04)
    #2 benchmark_worker /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:398:5 (rptest+0x4b601c)
    #3 thread_entry /code/jmmb/mimalloc-bench/bench/rptest/thread.c:29:2 (rptest+0x4b7fc4)

  As if synchronized via sleep:
    #0 nanosleep <null> (rptest+0x4215d8)
    #1 thread_sleep /code/jmmb/mimalloc-bench/bench/rptest/thread.c:78:2 (rptest+0x4b80f4)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:784:3 (rptest+0x4b51fc)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Location is heap block of size 272 at 0xfffff5503e80 allocated by main thread:
    #0 calloc <null> (rptest+0x4228b4)
    #1 benchmark_malloc /code/jmmb/mimalloc-bench/bench/rptest/./benchmark.h:41:13 (rptest+0x4b3c5c)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:723:8 (rptest+0x4b49a8)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Thread T2 (tid=17176, running) created by main thread at:
    #0 pthread_create <null> (rptest+0x4241b4)
    #1 thread_run /code/jmmb/mimalloc-bench/bench/rptest/thread.c:41:12 (rptest+0x4b7f14)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:772:29 (rptest+0x4b5128)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

SUMMARY: ThreadSanitizer: data race /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:38:14 in atomic_load32
==================
==================
WARNING: ThreadSanitizer: data race (pid=17173)
  Read of size 4 at 0xfffff5503f70 by main thread:
    #0 atomic_load32 /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:38:14 (rptest+0x4b7a20)
    #1 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:789:38 (rptest+0x4b5258)
    #2 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Previous atomic write of size 4 at 0xfffff5503f70 by thread T2:
    #0 __tsan_atomic32_fetch_add <null> (rptest+0x4735a0)
    #1 atomic_add32 /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:62:9 (rptest+0x4b7c04)
    #2 benchmark_worker /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:460:32 (rptest+0x4b6b54)
    #3 thread_entry /code/jmmb/mimalloc-bench/bench/rptest/thread.c:29:2 (rptest+0x4b7fc4)

  As if synchronized via sleep:
    #0 nanosleep <null> (rptest+0x4215d8)
    #1 thread_sleep /code/jmmb/mimalloc-bench/bench/rptest/thread.c:78:2 (rptest+0x4b80f4)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:784:3 (rptest+0x4b51fc)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Location is heap block of size 272 at 0xfffff5503e80 allocated by main thread:
    #0 calloc <null> (rptest+0x4228b4)
    #1 benchmark_malloc /code/jmmb/mimalloc-bench/bench/rptest/./benchmark.h:41:13 (rptest+0x4b3c5c)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:723:8 (rptest+0x4b49a8)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

  Thread T2 (tid=17176, running) created by main thread at:
    #0 pthread_create <null> (rptest+0x4241b4)
    #1 thread_run /code/jmmb/mimalloc-bench/bench/rptest/thread.c:41:12 (rptest+0x4b7f14)
    #2 benchmark_run /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:772:29 (rptest+0x4b5128)
    #3 main /code/jmmb/mimalloc-bench/bench/rptest/rptest.c:901:9 (rptest+0x4b7ba8)

SUMMARY: ThreadSanitizer: data race /code/jmmb/mimalloc-bench/bench/rptest/./atomic.h:38:14 in atomic_load32
==================
........756290 memory ops/CPU second (59MiB peak, 3MiB -> 58MiB bytes sample, 1375% overhead)
ThreadSanitizer: reported 6 warnings

Please consider adding scudo

Please consider adding the scudo allocator. It is available in the compiler-rt/lib/scudo/standalone directory of https://github.com/llvm/llvm-project/. From that directory, you can build a shared library with:

clang++ -flto -fuse-ld=lld -fPIC -std=c++14 -fno-exceptions -fno-rtti -fvisibility=internal -msse4.2 -O3 -I include -shared -o libscudo.so *.cpp -pthread

Run the CI in musl

It would be nice to run the CI in an Alpine Linux container, to see how the allocators behave on musl.

Add musl's malloc-ng

The source can be found here, and shouldn't be hard to integrate. I'll likely do it soon™ if nobody does it before.

malloc-ng is musl's default memory allocator, so it is used by default in Alpine Linux and in some versions of Void, Gentoo, OpenWRT, …

Cc: @richfelker

Add a complex benchmark

Some allocators make tradeoffs to perform very well on long-running, complex workloads, so we should have some benchmarks that measure this. Maybe something like Firefox loading a couple of web pages, or Apache?

Or maybe a lightweight JavaScript engine running some scripts.
@TheShiftedBit suggested compiling Node.js or Python against the memory allocator and running a workload on it.

sc fails to build on current compilers

Trying to run build-bench-env.sh on a system with GCC 10.2.0 resulted in this error from sc:

--------------------------------------------
build sc: version master
--------------------------------------------

/tmp/mimalloc-bench/extern /tmp/mimalloc-bench
Cloning into 'scalloc'...
remote: Enumerating objects: 3308, done.
remote: Total 3308 (delta 0), reused 0 (delta 0), pack-reused 3308
Receiving objects: 100% (3308/3308), 1.03 MiB | 4.95 MiB/s, done.
Resolving deltas: 100% (2261/2261), done.
Already on 'master'
Your branch is up to date with 'origin/master'.
updating dependencies for scalloc

gyp... 

Cloning into 'build/gyp'...
remote: Total 18491 (delta 12809), reused 18491 (delta 12809)
Receiving objects: 100% (18491/18491), 8.60 MiB | 5.14 MiB/s, done.
Resolving deltas: 100% (12809/12809), done.

gyp... done
  CXX(target) out/Release/obj.target/scalloc/src/glue.o
In file included from src/platform/assert.h:8,
                 from src/arena.h:12,
                 from src/glue.h:12,
                 from src/glue.cc:5:
src/log.h: In function ‘void LogPrintf(int, const char*, const char*, ...)’:
src/log.h:51:10: error: ‘char* strncat(char*, const char*, size_t)’ specified bound 1 equals source length [-Werror=stringop-overflow=]
   51 |   strncat(buffer, ":", 1);
      |   ~~~~~~~^~~~~~~~~~~~~~~~
src/log.h:55:10: error: ‘char* strncat(char*, const char*, size_t)’ specified bound 1 equals source length [-Werror=stringop-overflow=]
   55 |   strncat(buffer, " ", 1);
      |   ~~~~~~~^~~~~~~~~~~~~~~~
src/log.h:75:10: error: ‘char* strncat(char*, const char*, size_t)’ specified bound 1 equals source length [-Werror=stringop-overflow=]
   75 |   strncat(buffer, "\n", 1);
      |   ~~~~~~~^~~~~~~~~~~~~~~~~
src/log.h:70:12: error: ‘char* strncpy(char*, const char*, size_t)’ output truncated before terminating nul copying 3 bytes from a string of the same length [-Werror=stringop-truncation]
   70 |     strncpy(rest_start + (rest - strlen(truncate_suffix)),
      |     ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   71 |             truncate_suffix,
      |             ~~~~~~~~~~~~~~~~
   72 |             strlen(truncate_suffix));
      |             ~~~~~~~~~~~~~~~~~~~~~~~~
src/log.h:49:10: error: ‘char* strncat(char*, const char*, size_t)’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
   49 |   strncat(buffer, file, strlen(file));
      |   ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
src/log.h:53:10: error: ‘char* strncat(char*, const char*, size_t)’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
   53 |   strncat(buffer, line_buffer, strlen(line_buffer));
      |   ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make: *** [scalloc.target.mk:118: out/Release/obj.target/scalloc/src/glue.o] Error 1
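GCC rejects these calls because the bound passed to strncat is derived from the source string, whereas the bound is meant to limit how much is written into the destination. The strncpy call is flagged because it copies exactly the suffix length and therefore never writes a terminating NUL. A minimal sketch of patterns that avoid both diagnostics follows; `append_bounded` and `overwrite_tail` are hypothetical helpers written for illustration, not the fix adopted by scalloc:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the NUL-terminated string in dst
 * without writing past dst_size bytes (silently truncates if needed). */
static void append_bounded(char *dst, size_t dst_size, const char *src) {
    size_t used = strlen(dst);
    assert(used < dst_size);  /* dst must already be NUL-terminated */
    /* The bound is the space left in dst minus the NUL, not strlen(src). */
    strncat(dst, src, dst_size - used - 1);
}

/* Hypothetical helper: overwrite the last bytes of a buffer with a suffix.
 * memcpy makes it explicit that no terminating NUL is copied. */
static void overwrite_tail(char *dst, size_t dst_len, const char *suffix) {
    size_t n = strlen(suffix);
    assert(n <= dst_len);
    memcpy(dst + dst_len - n, suffix, n);
}
```

Another option for the build script would be to compile scalloc without -Werror, which leaves the source untouched but also hides the underlying problem.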

Do we want to make the benchmark portable, shell-wise?

Currently, the ./build-bench-env.sh and ./bench.sh scripts contain a lot of bash-specific constructs.
Do we want to make them work with other shells such as sh? For example, Alpine Linux doesn't use bash by default, but BusyBox's ash.
Do we even care?

redis build fails with multiple definition of SDS_NOINIT

/usr/bin/ld: server.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: sds.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: ziplist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: networking.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: util.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: object.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: db.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: replication.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: rdb.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: t_string.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: t_list.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: t_set.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: t_zset.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: t_hash.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: config.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: aof.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: pubsub.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: multi.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: debug.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: sort.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: syncio.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: cluster.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: crc16.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: slowlog.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: scripting.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: bio.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: rio.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: bitops.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: sentinel.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: notify.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: blocked.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: hyperloglog.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: latency.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: sparkline.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: redis-check-rdb.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: redis-check-aof.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: geo.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: lazyfree.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: module.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: evict.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: expire.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: childinfo.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: defrag.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: t_stream.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: lolwut.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here
/usr/bin/ld: lolwut5.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: multiple definition of `SDS_NOINIT'; quicklist.o:/tmp/mimalloc-bench/extern/redis-5.0.3/src/sds.h:37: first defined here

This is a bug in that version of Redis that shows up when building with GCC 10:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=957751
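
GCC 10 changed its default from -fcommon to -fno-common, so a header that defines an uninitialized global (instead of declaring it `extern`) now yields one definition per object file, and the link fails as shown above. The log suggests that sds.h line 37 is such a definition. A minimal sketch of the usual remedy, collapsed into one file for brevity; `EXAMPLE_NOINIT` and `is_noinit` are hypothetical names, not Redis code:

```c
#include <assert.h>
#include <stddef.h>

/* --- what would live in the header (e.g. example.h) --- */
/* Declaration only: 'extern' says the object is defined elsewhere,
 * so including this header emits no storage. */
extern const char *EXAMPLE_NOINIT;

/* --- what would live in exactly one source file (e.g. example.c) --- */
/* The single definition that every use is linked against. */
const char *EXAMPLE_NOINIT = "EXAMPLE_NOINIT";

/* A caller can recognize the sentinel by pointer identity. */
static int is_noinit(const char *init) {
    assert(EXAMPLE_NOINIT != NULL);
    return init == EXAMPLE_NOINIT;
}
```

As a stopgap, adding -fcommon to the compiler flags restores the pre-GCC-10 behavior. Whether the build script should do that or move to a newer Redis release is a separate decision.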

alloc-test uses incorrect types

alloc-test uses size_t to store time intervals, and performs arithmetic with large multiplications on them without promoting to a larger type. This causes integer overflows and invalid results on 32-bit targets that aren't extremely fast.

Some other tests may have similar issues; I haven't checked yet.
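
The overflow described above can be reproduced on any machine by using `uint32_t` as a stand-in for a 32-bit `size_t`. This is a minimal sketch with hypothetical function names; the exact expressions used in alloc-test are not reproduced here:

```c
#include <assert.h>
#include <stdint.h>

/* Mimics 32-bit size_t arithmetic: the product wraps modulo 2^32. */
static uint32_t us_to_ns_narrow(uint32_t us) {
    return us * 1000u;
}

/* Widen one operand first so the multiplication is done in 64 bits. */
static uint64_t us_to_ns_wide(uint32_t us) {
    return (uint64_t)us * 1000u;
}

/* Alternative: stay in 32 bits but fail loudly instead of wrapping. */
static uint32_t us_to_ns_checked(uint32_t us) {
    assert(us <= UINT32_MAX / 1000u);
    return us * 1000u;
}
```

With an interval of 5,000,000 µs (five seconds), `us_to_ns_narrow` returns 705032704, while `us_to_ns_wide` returns the expected 5000000000.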

hd fails to build unless llvm-dev is installed first

--------------------------------------------
build hd: version 9d137ef37
--------------------------------------------

/tmp/mimalloc-bench/extern /tmp/mimalloc-bench
/tmp/mimalloc-bench/extern/Hoard already exists; no need to git clone
HEAD is now at 9d137ef Updated TLS model.
clang++ -std=c++14 -flto -O3 -DNDEBUG -ffast-math -fno-builtin-malloc -Wall -Wextra -Wshadow -Wconversion -Wuninitialized  -g -W -Wconversion -Wall -I/usr/include/nptl -fno-builtin-malloc -pipe -fPIC -DNDEBUG  -I. -Iinclude -Iinclude/util -Iinclude/hoard -Iinclude/superblocks -IHeap-Layers -D_REENTRANT=1 -shared   source/libhoard.cpp source/unixtls.cpp -Bsymbolic -o libhoard.so -ldl -lpthread
/usr/bin/ld: /usr/lib/llvm-9/bin/../lib/LLVMgold.so: error loading plugin: /usr/lib/llvm-9/bin/../lib/LLVMgold.so: cannot open shared object file: No such file or directory
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [GNUmakefile:200: Linux-gcc-unknown] Error 1

Linking with gold requires having the llvm-dev package installed, at least on Debian. Please consider adding that to the script's list of packages to install.

Running binaries with tbb fails to find libtbbmalloc.so.2

run cfrac tbb: LD_PRELOAD=/tmp/mimalloc-bench/extern/tbb/build/linux_intel64_gcc_cc10.2.0_libc2.31_kernel5.7.0_release/libtbbmalloc_proxy.so.2 ./cfrac 17545186520507317056371138836327483792789528
./cfrac: error while loading shared libraries: libtbbmalloc.so.2: cannot open shared object file: No such file or directory

Result of print

What does each number printed by a benchmark run mean?

For example:
cfrac mi 08.31 3672 8.31 0.00 0 362

larsonN mi 1.118 3769812 279.90 3.95 0 950760

Missing debian packages

The script must install curl, automake, and bsdextrautils on Debian. Running it in the official Debian Docker images currently fails.

Use full names for allocators

Currently, it's a bit confusing what the short allocator codes refer to: what are the differences between mi, xdmi, xmi, dmi, smi, xsmi, …? Also, what is sc compared to scudo? What is tlsf? And so on.

I would have been happy to do it myself, but for some allocators, I actually have no clue what they're referring to :D

Add glibc's malloc tests

Please add both of glibc's malloc tests for all of the allocators:

  1. glibc/benchtests/bench-malloc-simple.c
  2. glibc/benchtests/bench-malloc-thread.c

Get them here: https://www.gnu.org/software/libc/sources.html:

git clone https://sourceware.org/git/glibc.git
cd glibc
git checkout master

Mirror: https://github.com/bminor/glibc/tree/master/benchtests

See also:

  1. https://kazoo.ga/a-simple-tool-to-test-malloc-performance/
  2. https://github.com/f18m/malloc-benchmarks/tree/master/benchmark-src

Build fails on Ubuntu 20.04

Running ./build-bench-env.sh all fails in the section build sm. I am running Ubuntu 20.04 (in WSL2, though I don't think that's relevant), with g++ 10.3.0. I will provide additional information, if necessary.

--------------------------------------------
build sm: version 709663f
--------------------------------------------

~/allocator-test/mimalloc-bench/extern ~/allocator-test/mimalloc-bench
Cloning into 'SuperMalloc'...
remote: Enumerating objects: 2756, done.
remote: Total 2756 (delta 0), reused 0 (delta 0), pack-reused 2756
Receiving objects: 100% (2756/2756), 5.07 MiB | 1.61 MiB/s, done.
Resolving deltas: 100% (1857/1857), done.
Note: switching to '709663f'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 709663f Merge pull request #48 from Willtor/master
set -e; rm -f ../release/env.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/env.cc -MG -MF ../release/env.d.$$; \
              sed 's,\(env\)\.o[ :]*,../release/\1.o ../release/env.d : ,g' < ../release/env.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/env.d; \
              rm -f ../release/env.d.$$
set -e; rm -f ../release/has_tsx.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/has_tsx.cc -MG -MF ../release/has_tsx.d.$$; \
              sed 's,\(has_tsx\)\.o[ :]*,../release/\1.o ../release/has_tsx.d : ,g' < ../release/has_tsx.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/has_tsx.d; \
              rm -f ../release/has_tsx.d.$$
set -e; rm -f ../release/futex_mutex.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/futex_mutex.cc -MG -MF ../release/futex_mutex.d.$$; \
              sed 's,\(futex_mutex\)\.o[ :]*,../release/\1.o ../release/futex_mutex.d : ,g' < ../release/futex_mutex.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/futex_mutex.d; \
              rm -f ../release/futex_mutex.d.$$
set -e; rm -f ../release/stats.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/stats.cc -MG -MF ../release/stats.d.$$; \
              sed 's,\(stats\)\.o[ :]*,../release/\1.o ../release/stats.d : ,g' < ../release/stats.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/stats.d; \
              rm -f ../release/stats.d.$$
set -e; rm -f ../release/footprint.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/footprint.cc -MG -MF ../release/footprint.d.$$; \
              sed 's,\(footprint\)\.o[ :]*,../release/\1.o ../release/footprint.d : ,g' < ../release/footprint.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/footprint.d; \
              rm -f ../release/footprint.d.$$
set -e; rm -f ../release/bassert.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/bassert.cc -MG -MF ../release/bassert.d.$$; \
              sed 's,\(bassert\)\.o[ :]*,../release/\1.o ../release/bassert.d : ,g' < ../release/bassert.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/bassert.d; \
              rm -f ../release/bassert.d.$$
set -e; rm -f ../release/cache.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/cache.cc -MG -MF ../release/cache.d.$$; \
              sed 's,\(cache\)\.o[ :]*,../release/\1.o ../release/cache.d : ,g' < ../release/cache.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/cache.d; \
              rm -f ../release/cache.d.$$
set -e; rm -f ../release/small_malloc.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/small_malloc.cc -MG -MF ../release/small_malloc.d.$$; \
              sed 's,\(small_malloc\)\.o[ :]*,../release/\1.o ../release/small_malloc.d : ,g' < ../release/small_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/small_malloc.d; \
              rm -f ../release/small_malloc.d.$$
set -e; rm -f ../release/large_malloc.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/large_malloc.cc -MG -MF ../release/large_malloc.d.$$; \
              sed 's,\(large_malloc\)\.o[ :]*,../release/\1.o ../release/large_malloc.d : ,g' < ../release/large_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/large_malloc.d; \
              rm -f ../release/large_malloc.d.$$
set -e; rm -f ../release/huge_malloc.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/huge_malloc.cc -MG -MF ../release/huge_malloc.d.$$; \
              sed 's,\(huge_malloc\)\.o[ :]*,../release/\1.o ../release/huge_malloc.d : ,g' < ../release/huge_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/huge_malloc.d; \
              rm -f ../release/huge_malloc.d.$$
set -e; rm -f ../release/rng.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/rng.cc -MG -MF ../release/rng.d.$$; \
              sed 's,\(rng\)\.o[ :]*,../release/\1.o ../release/rng.d : ,g' < ../release/rng.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/rng.d; \
              rm -f ../release/rng.d.$$
set -e; rm -f ../release/makechunk.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/makechunk.cc -MG -MF ../release/makechunk.d.$$; \
              sed 's,\(makechunk\)\.o[ :]*,../release/\1.o ../release/makechunk.d : ,g' < ../release/makechunk.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/makechunk.d; \
              rm -f ../release/makechunk.d.$$
set -e; rm -f ../release/malloc.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/malloc.cc -MG -MF ../release/malloc.d.$$; \
              sed 's,\(malloc\)\.o[ :]*,../release/\1.o ../release/malloc.d : ,g' < ../release/malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/malloc.d; \
              rm -f ../release/malloc.d.$$
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -I../src  -c ../src/bassert.cc -o ../release/bassert.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11  ../src/objsizes.cc ../release/bassert.o -o ../release/objsizes
./../release/objsizes  ../release/generated_constants.cc >  ../release/generated_constants.h
set -e; rm -f ../release/cache.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/cache.cc -MG -MF ../release/cache.d.$$; \
              sed 's,\(cache\)\.o[ :]*,../release/\1.o ../release/cache.d : ,g' < ../release/cache.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/cache.d; \
              rm -f ../release/cache.d.$$
set -e; rm -f ../release/small_malloc.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/small_malloc.cc -MG -MF ../release/small_malloc.d.$$; \
              sed 's,\(small_malloc\)\.o[ :]*,../release/\1.o ../release/small_malloc.d : ,g' < ../release/small_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/small_malloc.d; \
              rm -f ../release/small_malloc.d.$$
set -e; rm -f ../release/large_malloc.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/large_malloc.cc -MG -MF ../release/large_malloc.d.$$; \
              sed 's,\(large_malloc\)\.o[ :]*,../release/\1.o ../release/large_malloc.d : ,g' < ../release/large_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/large_malloc.d; \
              rm -f ../release/large_malloc.d.$$
set -e; rm -f ../release/huge_malloc.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/huge_malloc.cc -MG -MF ../release/huge_malloc.d.$$; \
              sed 's,\(huge_malloc\)\.o[ :]*,../release/\1.o ../release/huge_malloc.d : ,g' < ../release/huge_malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/huge_malloc.d; \
              rm -f ../release/huge_malloc.d.$$
set -e; rm -f ../release/makechunk.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/makechunk.cc -MG -MF ../release/makechunk.d.$$; \
              sed 's,\(makechunk\)\.o[ :]*,../release/\1.o ../release/makechunk.d : ,g' < ../release/makechunk.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/makechunk.d; \
              rm -f ../release/makechunk.d.$$
set -e; rm -f ../release/malloc.d; \
              g++ -MM -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  ../src/malloc.cc -MG -MF ../release/malloc.d.$$; \
              sed 's,\(malloc\)\.o[ :]*,../release/\1.o ../release/malloc.d : ,g' < ../release/malloc.d.$$ \
               | sed 's,generated_constants.h,../release/generated_constants.h,' > ../release/malloc.d; \
              rm -f ../release/malloc.d.$$
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/malloc.cc -o ../release/malloc.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/makechunk.cc -o ../release/makechunk.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/rng.cc -o ../release/rng.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/huge_malloc.cc -o ../release/huge_malloc.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/large_malloc.cc -o ../release/large_malloc.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/small_malloc.cc -o ../release/small_malloc.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/cache.cc -o ../release/cache.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/footprint.cc -o ../release/footprint.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/stats.cc -o ../release/stats.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/futex_mutex.cc -o ../release/futex_mutex.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src   -c -o ../release/generated_constants.o ../release/generated_constants.cc
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/has_tsx.cc -o ../release/has_tsx.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11     -I../release   -I../src  -c ../src/env.cc -o ../release/env.o
mkdir -p ../release/lib
gcc-ar cr ../release/lib/supermalloc.a  ../release/malloc.o  ../release/makechunk.o  ../release/rng.o  ../release/huge_malloc.o  ../release/large_malloc.o  ../release/small_malloc.o  ../release/cache.o  ../release/bassert.o  ../release/footprint.o  ../release/stats.o  ../release/futex_mutex.o  ../release/generated_constants.o  ../release/has_tsx.o  ../release/env.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11   ../release/malloc.o  ../release/makechunk.o  ../release/rng.o  ../release/huge_malloc.o  ../release/large_malloc.o  ../release/small_malloc.o  ../release/cache.o  ../release/bassert.o  ../release/footprint.o  ../release/stats.o  ../release/futex_mutex.o  ../release/generated_constants.o  ../release/has_tsx.o  ../release/env.o -shared -ldl -o ../release/lib/libsupermalloc.so
cc -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c11    -I../release   -I../src  -c ../tests/aligned_alloc.c -o ../release/aligned_alloc.o
g++ -W -Wall  -O3 -flto -ggdb -pthread -fPIC -mrtm  -std=c++11  ../release/aligned_alloc.o -ldl -L../release/lib -Wl,-rpath,../release/lib -ldl ../release/lib/libsupermalloc.so -o ../release/aligned_alloc
lto1: fatal error: bytecode stream in file ‘../release/aligned_alloc.o’ generated with GCC compiler older than 10.0
compilation terminated.
lto-wrapper: fatal error: g++ returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [../Makefile.include:122: ../release/aligned_alloc] Error 1
rm ../release/aligned_alloc.o
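
One reading of the log above: `aligned_alloc.o` is the only object compiled with `cc` rather than `g++`, and the final `g++ -flto` link rejects its LTO bytecode as coming from a GCC older than 10.0. That would fit a system where `cc` and `g++` resolve to different GCC major versions. The sketch below is one way to check for that, not part of the SuperMalloc build; the helper names `same_major` and `check_lto_toolchain` are made up for illustration, and comparing `-dumpversion` output is only meaningful when both drivers are GCC.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two version strings share a major version,
# e.g. "9.4.0" and "9.3.0" match, "9.4.0" and "10.2.1" do not.
same_major() {
  [ "${1%%.*}" = "${2%%.*}" ]
}

# Hypothetical helper: compare the two compiler drivers that appear in the log.
# Returns 0 if the majors match, 1 if they differ, 2 if a driver is missing.
check_lto_toolchain() {
  cc_ver=$(cc -dumpversion 2>/dev/null) || return 2
  cxx_ver=$(g++ -dumpversion 2>/dev/null) || return 2
  if same_major "$cc_ver" "$cxx_ver"; then
    echo "cc ($cc_ver) and g++ ($cxx_ver) share a major version"
  else
    echo "cc ($cc_ver) and g++ ($cxx_ver) differ; LTO objects may not link"
    return 1
  fi
}
```

Sourcing the file and running `check_lto_toolchain` before the build prints which versions are in use. If they differ, pointing `cc` at the same GCC major version as `g++` is one thing to try.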

Add the new TCMalloc

Unfortunately, there are two allocators both called tcmalloc. You are using the one from the gperftools repo, but a newer fork from upstream, which uses restartable sequences, is available at https://github.com/google/tcmalloc. It would be nice to see results for both of them alongside mimalloc.
