
coremark-pro's Introduction

About

CoreMark®-PRO is a comprehensive, advanced processor benchmark that works with and enhances the market-proven industry-standard EEMBC CoreMark® benchmark. While CoreMark stresses the CPU pipeline, CoreMark-PRO tests the entire processor, adding comprehensive support for multicore technology, a combination of integer and floating-point workloads, and data sets for utilizing larger memory subsystems. Together, EEMBC CoreMark and CoreMark-PRO provide a standard benchmark covering the spectrum from low-end microcontrollers to high-performance computing processors.

The EEMBC CoreMark-PRO benchmark contains five prevalent integer workloads and four popular floating-point workloads.

The integer workloads include:

  • JPEG compression
  • ZIP compression
  • XML parsing
  • SHA-256 Secure Hash Algorithm
  • A more memory-intensive version of the original CoreMark

The floating-point workloads include:

  • Radix-2 Fast Fourier Transform (FFT)
  • Gaussian elimination with partial pivoting derived from LINPACK
  • A simple neural-net
  • A greatly improved version of the Livermore loops benchmark using the following 24 FORTRAN kernels converted to C (all of these reported as a single score of the loops.c workload). The standard Livermore loops include:
    • Kernel 1 -- hydro fragment
    • Kernel 2 -- ICCG excerpt (Incomplete Cholesky Conjugate Gradient)
    • Kernel 3 -- inner product
    • Kernel 4 -- banded linear equations
    • Kernel 5 -- tri-diagonal elimination, below diagonal
    • Kernel 6 -- general linear recurrence equations
    • Kernel 7 -- equation of state fragment
    • Kernel 8 -- ADI integration
    • Kernel 9 -- integrate predictors
    • Kernel 10 -- difference predictors
    • Kernel 11 -- first sum
    • Kernel 12 -- first difference
    • Kernel 13 -- 2-D PIC (Particle In Cell)
    • Kernel 14 -- 1-D PIC (Particle In Cell)
    • Kernel 15 -- Casual Fortran.
    • Kernel 16 -- Monte Carlo search loop
    • Kernel 17 -- implicit, conditional computation
    • Kernel 18 -- 2-D explicit hydrodynamics fragment
    • Kernel 19 -- general linear recurrence equations
    • Kernel 20 -- Discrete ordinates transport, conditional recurrence on xx
    • Kernel 21 -- matrix*matrix product
    • Kernel 22 -- Planckian distribution
    • Kernel 23 -- 2-D implicit hydrodynamics fragment
    • Kernel 24 -- find location of first minimum in array

The CoreMark-PRO score is a weighted geometric mean of the workload scores, as described on page 12 of the provided PDF document.

Basic Overview

Build the benchmark using the make command, specifying a target architecture with TARGET=. Accommodations for custom targets and toolchains are placed in the util/make folder. To compile for Linux and the gcc64 toolchain, use this command:

% make TARGET=linux64 build

This will include the util/make/linux64.mak file, which in turn includes the gcc64.mak file for the toolchain. When finished, nine executables are saved in the builds/linux64/gcc64/bin folder. These are the binaries used by the test.

The command:

% make TARGET=linux64 XCMD='-c4' certify-all

...runs all of the nine tests (with four contexts), collects their output scores, and processes them through a Perl script to generate the final CoreMark-PRO score, like so:

WORKLOAD RESULTS TABLE

                                                 MultiCore SingleCore           
Workload Name                                     (iter/s)   (iter/s)    Scaling
----------------------------------------------- ---------- ---------- ----------
cjpeg-rose7-preset                                  555.56     156.25       3.56
core                                                  4.87       1.30       3.75
linear_alg-mid-100x100-sp                          1428.57     409.84       3.49
loops-all-mid-10k-sp                                 22.56       6.25       3.61
nnet_test                                            33.22      10.56       3.15
parser-125k                                          70.18      19.23       3.65
radix2-big-64k                                     1666.67     453.72       3.67
sha-test                                            588.24     172.41       3.41
zip-test                                            500.00     142.86       3.50

MARK RESULTS TABLE

Mark Name                                        MultiCore SingleCore    Scaling
----------------------------------------------- ---------- ---------- ----------
CoreMark-PRO                                      19183.84    5439.59       3.53

This will run all nine tests twice, once with one context and once with a user-defined number of contexts, in this case four, and then generate the scaling between the two configurations. Please refer to the documentation for explanations of how to change the number of contexts and workers.
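The Scaling column in the sample tables is consistent with a plain ratio of the multi-context rate to the single-context rate. A minimal sketch of that arithmetic follows; the function names are hypothetical and are not part of the benchmark or its Perl script.

```c
#include <stdio.h>

/* Scaling: multi-context throughput divided by single-context
 * throughput, both in iterations per second. */
double scaling_factor(double multicore_iter_per_s, double singlecore_iter_per_s)
{
    /* Guard against division by zero for a failed or missing run. */
    if (singlecore_iter_per_s <= 0.0) {
        return 0.0;
    }
    return multicore_iter_per_s / singlecore_iter_per_s;
}

/* Print one row laid out like the workload results table. */
void print_scaling_row(const char *name, double multi, double single)
{
    printf("%-47s %10.2f %10.2f %10.2f\n", name, multi, single,
           scaling_factor(multi, single));
}
```

For the cjpeg-rose7-preset row above, scaling_factor(555.56, 156.25) gives roughly 3.56, matching the table.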

Source Code Overview

MITH Porting

The benchmark utilizes EEMBC's Multi-Instance Test Harness, or MITH. Found in the mith folder, the test harness consists of high-level functions for launching the tests, and a low-level abstraction layer (in the al folder) for interfacing with the hardware or operating system. The file th_al.c in the al/src folder is the only place modifications are needed to port the benchmark to new hardware. In fact, changing any other source files invalidates the CoreMark-PRO score.

Out of the box, the MITH abstraction layer is configured to work with the POSIX pthread architecture on Linux, but any thread scheduling system that can be represented through the MITH abstraction layer is valid (including no threading on baremetal). The MITH harness provides a mith_main function, and the actual main functions are provided in the workload areas.

The example above was run from a Linux CLI, where it is possible to invoke each binary in simple succession via the Makefile and collect scores for analysis by the Perl script. Non-Linux targets (e.g., baremetal) are more complex to run, as each binary needs to be downloaded to the hardware manually and the individual results collected from a remote debugger console by retargeting the al_printf function. The computation for the CoreMark-PRO score is described in the included PDF documentation.

Workloads, Kernels, and Datasets

As stated above, each workload compiles to a single binary. The workloads in the workloads folder contain a top-level C-file that instantiates the test harness. A workload consists of one or more benchmark kernels (stored in the kernels folder), and a dataset (see NOTE below). For example, the binary loops-all-mid-10k-sp.exe is compiled from workloads/loops-all-mid-10k-sp. This workload invokes the Livermore Loops kernel from benchmarks/loops/ and configures it to use the ref-sp/10k.c file. This file includes parameters for constructing a 10 KB dataset, as well as the reference data results to compare against after the benchmark completes. Floating point benchmarks check for accuracy by checking a minimum number of bits that are allowed to differ (this is of greater concern in other benchmarks like EEMBC's FPMark, which stresses single- and double-precision performance). Other benchmark kernels contain just the input dataset and no reference, such as the JPEG workload.

NOTE: In CoreMark-PRO, the mapping is 1:1: each workload invokes one kernel. In other MITH-based benchmarks from EEMBC, such as AutoBench 2.0, multiple kernels are arranged in different configurations in each workload.
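The idea of checking floating-point results against a minimum number of matching bits can be pictured with a small sketch. This is not the MITH implementation: the function names, the use of relative error, and the bit-counting rule are all assumptions chosen for illustration.

```c
#include <float.h>

/* Hypothetical helper: estimate how many leading significand bits of a
 * single-precision result agree with the reference, using the relative
 * error. Each halving of the error counts as one more matching bit. */
int matching_bits_sp(float reference, float result)
{
    double ref = (double)reference;
    double err = (double)result - ref;
    if (err < 0.0) {
        err = -err;
    }
    if (err == 0.0) {
        return FLT_MANT_DIG;          /* exact match: all bits agree */
    }
    double mag = (ref < 0.0) ? -ref : ref;
    double rel = (mag > 0.0) ? err / mag : err;

    int bits = 0;
    double tol = 0.5;
    /* A NaN result fails every comparison and therefore scores 0 bits. */
    while (bits < FLT_MANT_DIG && rel <= tol) {
        bits++;
        tol *= 0.5;
    }
    return bits;
}

/* Hypothetical pass/fail check against a required minimum bit count. */
int passes_accuracy(float reference, float result, int min_bits)
{
    return matching_bits_sp(reference, result) >= min_bits;
}
```

With this rule, a result of 1.25 against a reference of 1.0 has a relative error of 0.25 and counts as 2 matching bits.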

Memory Usage

The following metrics were captured under Ubuntu 20.04 running on an Intel(R) Core(TM) i5-1035G4 CPU using GCC 9.3.0 (with -O2). The static footprint was taken with size and the dynamic peak with valgrind --stats=yes --profile-heap=yes --tool=massif --stacks=yes --time-unit=B subtool. Values are in bytes.

Component                          text    data    bss       dec massif peak B
----------------------------- --------- ------- ------ --------- -------------
cjpeg-rose7-preset.exe          112,292 268,576    208   381,076       141,488
core.exe                         70,356   2,240  2,448    75,044        12,496
linear_alg-mid-100x100-sp.exe    75,277   3,112  1,424    79,813        67,656
loops-all-mid-10k-sp.exe         91,391   4,696  3,664    99,751     3,427,184
nnet_test.exe                    73,739   3,568 40,272   117,579        50,528
parser-125k.exe                  90,415   2,272    208    92,895     1,043,032
radix2-big-64k.exe            1,449,387   1,904    688 1,451,979     1,580,504
sha-test.exe                     80,717   2,184    208    83,109     1,052,272
zip-test.exe                    135,245   2,776    208   138,229     3,420,864

Documentation

Please refer to the PDF user guide located in the docs folder of this repository for more details.

More info may be found at the EEMBC CoreMark-PRO website.

Run Rules

What is and is not allowed.

Required

  1. Each workload must run for at least 1000 times the minimum timer resolution. For example, on a 10 ms timer tick based system, each workload must run for at least 10 seconds.
  2. To report results, the build target certify-all must be used or that process must be followed if make is not usable (e.g. via embedded debugger runs); each workload must report no errors when run with -v1.
  3. All workloads within CoreMark-PRO must be compiled with the same flags and linked with the same flags. These must be disclosed and/or reported with any publication of CoreMark-PRO scores.
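The first required rule is simple arithmetic, and it can help to turn it into an iteration count to pass with the -i option. The sketch below assumes you already have a rough iterations-per-second figure from a trial run; the function names are hypothetical.

```c
/* Rule 1: a workload must run for at least 1000 times the minimum
 * timer resolution. Both values are in seconds. */
double min_run_seconds(double timer_resolution_s)
{
    return 1000.0 * timer_resolution_s;
}

/* Smallest whole iteration count that reaches the minimum run time,
 * given a measured rate in iterations per second. */
unsigned long min_iterations(double iter_per_s, double timer_resolution_s)
{
    double needed = min_run_seconds(timer_resolution_s) * iter_per_s;
    if (needed <= 0.0) {
        return 1;                     /* no usable rate: run at least once */
    }
    unsigned long n = (unsigned long)needed;  /* truncates toward zero */
    if ((double)n < needed) {
        n++;                          /* round up to a whole iteration */
    }
    return n > 0 ? n : 1;
}
```

For a 10 ms timer tick, min_run_seconds(0.010) is 10 seconds, as in the rule's own example. At a rate of 156.25 iterations per second, that works out to 1563 iterations.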

Allowed

  1. You may change the number of iterations.
  2. You may change toolchain and build/load/run options.
  3. You may change the implementation of porting files under mith/al sub tree.
  4. You may change makefiles or use IDE projects.
  5. Profile guided optimizations are allowed on base run; if used, they must be used for all workloads.

NOT ALLOWED

  1. You may not change the source files under the benchmarks or workloads folders.

Baremetal and Other Ports

NEW! A baremetal porting guide has been added to the doc directory of this repository.

The MITH hardware abstraction layer is defined in mith/al/src. These files contain any low-level functions needed by the benchmark. The MITH framework is used for a number of benchmarks, so not all options are relevant to or used by CoreMark-PRO.

The provided implementation was tested on 32- and 64-bit Linux distributions, as well as Cygwin. Since the datasets are loaded implicitly as C-structures, file I/O is not used. The only major modification likely needed for an embedded port is how pthreads are implemented. Choices are:

  1. Provide a POSIX thread library
  2. Switch to single-thread mode by using the reference al_single.c instead of al_smp.c
  3. Implement the functions in al_smp.c using the target platform's threading SDK

There's no standard flash downloader or response extractor included because every toolchain or IDE behaves differently in this regard. One easy method is to load each compiled firmware image through an IDE debugger and extract the results either by redirecting the th_printf function, or by simply reading the IDE debugger output, assuming vsprintf is redirected to the IDE or console via the debugger link. The computation of the CoreMark-PRO score is described on page 12 of the provided PDF user's guide.

Defines

The makefiles automate macro setting and data set inclusion. When not using the makefiles, it can be tricky to determine the proper macro definitions; be sure to set the following macros and use these datasets:

Component                 Macros                                               File or directory to include
------------------------- ---------------------------------------------------- ------------------------------------
cjpeg-rose7-preset        SELECT_PRESET_ID=1, USE_PRESET                       consumer_v2/cjpeg/*.c
                                                                               consumer_v2/cjpeg/data/Rose256_bmp.c
core                                                                           core/core_*.c
linear_alg-mid-100x100-sp USE_FP32=1                                           fp/linpack/linpack.c
                                                                               fp/linpack/ref/inputs_f32.c
loops-all-mid-10k-sp      USE_FP32=1                                           fp/loops/loops.c
                                                                               fp/loops/ref-sp/*.c
nnet_test                 USE_FP64=1                                           fp/nnet/nnet.c
                                                                               fp/nnet/ref/*.c
parser-125k                                                                    darkmark/parser/*.c
radix2-big-64k            USE_FP64=1                                           fp/fft_radix2/fft_radix2.c
                                                                               fp/fft_radix2/ref/*.c
sha-test                                                                       darkmark/sha/*.c
zip-test                  MITH_MEMORY_ONLY_VERSION, ZLIB_COMPAT_ALL, ZLIB_ANSI darkmark/zip/zip_darkmark.c
                                                                               darkmark/zip/zlib-1.2.8/*.c but exclude gzread.c and gzwrite.c.

Argv & Argc

Each workload defines a main() that takes argc and argv. In order to run the performance measurement, the benchmark requires the input argument "-v0" (turn off default validation mode), and to follow the run rules, iterations might need to be changed with the "-i" option. Since these options are provided via argv, this may cause problems on targets without a command line. If your debugger allows semihosting, you can provide these options through an argument string. If your compiler can rename the entrypoint from main() to something else, you can create a wrapper that calls the workload main() with an argument string, e.g. char *argv[] = { "-v0", "-i100" }; .... If neither option is available, you will need to alter the workload main() function to be main(void) and define your own argc and argv immediately prior to the call to al_main().
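The wrapper approach can be sketched as follows. Everything named here is hypothetical: run_with_fixed_args is an illustrative helper, and stub_workload_main only stands in for the renamed workload entry point so the sketch is complete on its own. The sketch follows the usual C convention of a program name in argv[0]; the inline example above lists only the options, so check which layout your port's argument parser expects.

```c
#include <stddef.h>

/* Signature shared by main()-style entry points. */
typedef int (*entry_fn)(int argc, char *argv[]);

/* Call an entry point with a fixed argument vector: performance mode
 * (-v0) and 100 iterations (-i100). */
int run_with_fixed_args(entry_fn entry)
{
    /* Writable storage, because argv strings are not const in C. */
    static char prog[]  = "workload";   /* assumed argv[0] placeholder */
    static char opt_v[] = "-v0";
    static char opt_i[] = "-i100";
    static char *argv[] = { prog, opt_v, opt_i, NULL };
    const int argc = (int)(sizeof argv / sizeof argv[0]) - 1;

    return entry(argc, argv);
}

/* Stand-in for the renamed workload main(); it only counts the
 * option-style arguments it receives. */
int stub_workload_main(int argc, char *argv[])
{
    int options = 0;
    for (int i = 1; i < argc; i++) {
        if (argv[i] != NULL && argv[i][0] == '-') {
            options++;
        }
    }
    return options;
}
```

On a real port, the platform's true entry point would call run_with_fixed_args with the renamed workload main() in place of the stub.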

The adaptation layer

You are allowed to alter th_al.c. It is expected that the platform startup and init code will need to go in al_main(), as well as porting the clock mechanism defined by the al_signal_*() functions. Often these are simply replaced with an interrupt timer (rather than an RTC) at 1ms resolution. If the timer is not 1ms, you will need to set the CLOCKS_PER_SEC defines in th_al.c.
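To see why the tick rate matters, here is a minimal sketch of converting timer ticks into seconds and a rate. The names are hypothetical and this is not the code in th_al.c; the point is only that the reported time scales with whatever ticks-per-second value the port declares.

```c
/* Hypothetical tick rate for a 1 ms timer interrupt. On a real port
 * this corresponds to the CLOCKS_PER_SEC value set in th_al.c. */
#define EXAMPLE_TICKS_PER_SEC 1000UL

/* Convert a tick count to seconds for a given tick rate. */
double ticks_to_seconds(unsigned long ticks, unsigned long ticks_per_sec)
{
    if (ticks_per_sec == 0) {
        return 0.0;                   /* avoid division by zero */
    }
    return (double)ticks / (double)ticks_per_sec;
}

/* Iterations per second from an iteration count and elapsed ticks. */
double iterations_per_second(unsigned long iterations,
                             unsigned long ticks,
                             unsigned long ticks_per_sec)
{
    double secs = ticks_to_seconds(ticks, ticks_per_sec);
    return (secs > 0.0) ? (double)iterations / secs : 0.0;
}
```

With a 1 ms tick, 269 ticks is 0.269 seconds. If the timer actually ticks at another rate and the define is left unchanged, every reported time and rate is off by the same factor.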

Computing the Overall Score

The final score is a geometric mean of the components, each divided by a reference platform score and scaled. If CoreMark-PRO is run on the host system, a Perl script will automatically perform this computation. If run on a remote target, it must be done manually. First collect the iterations-per-second for each of the components. Then divide each component by the reference score shown below, multiply each term by the scale factor, and take the geometric mean of the resulting values. Finally, multiply by 1000.

Component                     Scale Factor Reference Score
----------------------------- ------------ ---------------
cjpeg-rose7-preset.exe                   1         40.3438
core.exe                             10000            2855
linear_alg-mid-100x100-sp.exe            1         38.5624
loops-all-mid-10k-sp.exe                 1         0.87959
nnet_test.exe                            1         1.45853
parser-125k.exe                          1         4.81116
radix2-big-64k.exe                       1         99.6587
sha-test.exe                             1         48.5201
zip-test.exe                             1         21.3618

Final Score = GeoMean(s0/r0 * x0, s1/r1 * x1, ..., sN/rN * xN) * 1000

Where sN, rN and xN refer to the current score, reference score and scale factor, respectively, for each of the N components.

Submitting Results

CoreMark-PRO results can be submitted on the web. Open a web browser and go to the submission page. After registering an account you may enter a score.

Publication Rules

  1. As stated in the license, a "Commercial COREMARK-PRO License" from EEMBC is required for Licensee to disclose, reference, or publish test results generated by COREMARK-PRO in Licensee’s marketing of any of Licensee’s commercially‐available, product‐related materials, including, but not limited to product briefs, website, product brochures, product datasheets, or any white paper or article made available for public consumption. (This does not include academic research or personal use)
  2. Scores must be uploaded to the EEMBC CoreMark-PRO website before being published in any capacity to ensure run rules were followed.

Copyright and Licensing

EEMBC and CoreMark are trademarks of EEMBC. Please refer to the file LICENSE.md for the license associated with this benchmark software.

coremark-pro's Issues

regarding the benchmark run option

I would like to know if I run the benchmarks individually and not for the certify mark, do I still need to add -v0 ? Since from the manual it says

"A value of 0 indicates performance mode, a value of 1 indicates verification mode, which invokes
result checking."

Thanks!

Running `loops-all-mid-10k-sp.exe` on PPC64 (e6500) causes illegal instruction

When trying to run the benchmark on a reference board with the e6500 CPU, the test loops-all-mid-10k-sp.exe fails to run with an 'illegal instruction' error.

Trying to debug this, I find that the error is on line 333 running this command: gdb -ex=r ./builds/linux64/gcc64/bin/loops-all-mid-10k-sp.exe
Logs:

Reading symbols from ./builds/linux64/gcc64/bin/loops-all-mid-10k-sp.exe...
Starting program: /run/media/mmcblk0p2/coremark-pro/builds/linux64/gcc64/bin/loops-all-mid-10k-sp.exe
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGILL, Illegal instruction.
0x00000000404a7448 in define_params_loops (idx=<optimized out>, name=<optimized out>, dataset=<optimized out>) at /<redacted>/benchmarks/fp/loops/loops.c:333
333			int m2=(int)th_sqrt((e_fp)(params->N*4));

Taking a closer look at this line after disassembling the function, I can see a number of instructions:
gdb -batch -ex "disassemble/rs define_params_loops" ./builds/linux64/gcc64/bin/loops-all-mid-10k-sp.exe | less
logs:

333                     int m2=(int)th_sqrt((e_fp)(params->N*4));
   0x000000000000b420 <+688>:   80 fe 00 14     lwz     r7,20(r30)
   0x000000000000b424 <+692>:   3d 22 ff fd     addis   r9,r2,-3
   0x000000000000b428 <+696>:   c0 09 ec 08     lfs     f0,-5112(r9)
   0x000000000000b42c <+700>:   78 e9 10 28     rldic   r9,r7,2,32
   0x000000000000b430 <+704>:   f9 21 00 88     std     r9,136(r1)
   0x000000000000b434 <+708>:   60 42 00 00     ori     r2,r2,0
   0x000000000000b438 <+712>:   c8 21 00 88     lfd     f1,136(r1)
   0x000000000000b43c <+716>:   fd 80 0e 9c     fcfid   f12,f1
   0x000000000000b440 <+720>:   fc 20 60 18     frsp    f1,f12
   0x000000000000b444 <+724>:   fc 01 00 00     fcmpu   cr0,f1,f0
   0x000000000000b448 <+728>:   ef e0 08 2c     fsqrts  f31,f1
   0x000000000000b44c <+732>:   41 80 03 44     blt     0xb790 <define_params_loops+1568>
   0x000000000000b450 <+736>:   ff e0 f8 1e     fctiwz  f31,f31
   0x000000000000b454 <+740>:   39 01 00 80     addi    r8,r1,128
   0x000000000000b458 <+744>:   38 c0 00 20     li      r6,32
   0x000000000000b45c <+748>:   7c c9 03 a6     mtctr   r6
   0x000000000000b460 <+752>:   7c ea 07 b4     extsw   r10,r7
   0x000000000000b464 <+756>:   39 3e 00 18     addi    r9,r30,24
   0x000000000000b468 <+760>:   7f e0 47 ae     stfiwx  f31,0,r8
   0x000000000000b46c <+764>:   e9 1e 01 1e     lwa     r8,284(r30)
   0x000000000000b470 <+768>:   80 a1 00 80     lwz     r5,128(r1)

I have a feeling the issue might be the blt call, but I'm not 100% certain. Any thoughts or help would be useful.

Question about the iteration count in the kernel of "cjpeg-rose7-preset"

In the main() of workload "cjpeg-rose7-preset", I see a lot of calls to helper_cjpegrose7preset() where repeats_override is given 1. Then at this line, leading to these lines, we know tcdef->iterations is thus set to 1.

When this workload is run, kernel t_run_test_cjpeg() is executed.
According to this line, override is 1.
Therefore tcdef->rec_iterations will not be used (this line).

tcdef->rec_iterations is configured here and is supposed to be the "recommended" iteration count according to this and this.

I wonder, why is repeats_override not given 0 to have the recommended value used?

(BTW, the kernel of other workloads does not use tcdef->iterations.)

which Tool I have to use to run coremark-pro on NXP imx rt1170

Hi team,

To test the performance of an NXP i.MX RT1170 board, I need to run the CoreMark-PRO benchmark on it.
Which tools (SDK) do I need to use to flash the board, and is there a reference guide for running the benchmark on RT1170 boards?

Thanks
chaithanya

linear_alg-mid-100x100-sp and loops-all-mid-10k-sp compilation fails

I use "make TARGET=linux64 build".
The following error occurs:

/usr/bin/gcc -c -g -O2 -DNDEBUG -DHOST_EXAMPLE_CODE=1 -std=gnu99 -DHAVE_SYS_STAT_H=1 -DUSE_NATIVE_PTHREAD=1 -DGCC_INLINE_MACRO=1 -DNO_RESTRICT_QUALIFIER=1 -DEE_SIZEOF_LONG=8 -DEE_SIZEOF_PTR=8 -DEE_PTR_ALIGN=8 -DHAVE_SYS_STAT_H=1 -DUSE_NATIVE_PTHREAD=1 -DGCC_INLINE_MACRO=1 -DNO_RESTRICT_QUALIFIER=1 -DEE_SIZEOF_LONG=8 -DEE_SIZEOF_PTR=8 -DEE_PTR_ALIGN=8 -DEE_SIZEOF_INT=4 -DEE_SIZEOF_LONG=8 -Wall -Wno-long-long -fno-asm -fsigned-char -DUSE_FP32 -I/home/tyliu/project/coremark-pro/mith/include -I/home/tyliu/project/coremark-pro/mith/al/include -I/home/tyliu/project/coremark-pro/mith/al/include /home/tyliu/project/coremark-pro/workloads/linear_alg-mid-100x100-sp/linear_alg-mid-100x100-sp.c -o linear_alg-mid-100x100-sp.o
cd /home/tyliu/project/coremark-pro/builds/linux64/gcc64/obj/bench/fp/linpack/SP && make -f /home/tyliu/project/coremark-pro/benchmarks/fp/linpack/SP/../Makefile build
make[3]: Entering directory '/home/tyliu/project/coremark-pro/builds/linux64/gcc64/obj/bench/fp/linpack/SP'
make[3]: /home/tyliu/project/coremark-pro/benchmarks/fp/linpack/SP/../Makefile: No such file or directory
make[3]: *** No rule to make target '/home/tyliu/project/coremark-pro/benchmarks/fp/linpack/SP/../Makefile'. Stop.
make[3]: Leaving directory '/home/tyliu/project/coremark-pro/builds/linux64/gcc64/obj/bench/fp/linpack/SP'
/home/tyliu/project/coremark-pro/workloads/linear_alg-mid-100x100-sp//Makefile:71: recipe for target '/home/tyliu/project/coremark-pro/builds/linux64/gcc64/obj/bench/fp/linpack/SP/done.build' failed
make[2]: *** [/home/tyliu/project/coremark-pro/builds/linux64/gcc64/obj/bench/fp/linpack/SP/done.build] Error 2
make[2]: Leaving directory '/home/tyliu/project/coremark-pro/builds/linux64/gcc64/obj/workloads/linear_alg-mid-100x100-sp'

Shell script for running

Is there a simple sh script that can be leveraged on embedded targets for running cross-compiled executables with the necessary flags? This would be really helpful for embedded development, where targets rarely have the development tools the current process requires.

Race condition in Makefile ?

From time to time, when compiling with "make -j", we have a race condition in Makefile, with messages like:

Fatal error: can't create al/src/th_al.o: No such file or directory

It seems the following patch solves the issue:

diff --git a/mith/Makefile b/mith/Makefile
index faeaea1..71a6f9d 100644
--- a/mith/Makefile
+++ b/mith/Makefile
@@ -54,4 +54,4 @@ print-%:
        @echo $* = $($*)  
        @echo [Defined at $(origin $*)] 
 
-$(TH_OBJS) : $(TH_HDRS)        
\ No newline at end of file
+$(TH_OBJS) : $(TH_HDRS)        $(MYDIRS)

Incorrect Coremark-Pro score calculation

According to the documentation the coremark pro score should be calculated with:

100 x geomean( ( subtestscore / reference score ) * scale factor)

but in fact the produced score seems to be:

1000 x geomean( ( subtestscore / reference score ) * scale factor)

The produced score seems to be 10x what it should be

How to run workloads in bare-metal with multicore?

Hi,
I have seen the video tutorial of CoreMark-PRO in YouTube, and I know how to transplant one workload to the bare-metal. But I don’t understand how to transplant nine workload to a bare-metal, did I need to download and run the workload one by one? Run 9 times with respective workload?
If so, what should I do to allocate one workload to multicore? Such as Stm32H747, which has two cores, Cortex-M4 and M7, how can I make two cores work simultaneously to calculate the workload? If I can only use one core each time, it’s obvious that I don’t fully use the resources.

GLIBC_2.27' not found required by arm-linux-gnueabihf

As you suggest "if you had used arm-linux-eabi then it would have compiled. This is unrelated to the benchmark. Here is a good explanation of the three components in the toolchain name."

I have downloaded GNU toolchain version 11.3 from https://snapshots.linaro.org/gnu-toolchain/ but I am getting the error below.
./linaro/gcc-linaro-11.3.1-2022.06-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/11.3.1/../../../../arm-linux-gnueabihf/bin/ld: /lib64/libc.so.6: version `GLIBC_2.27' not found (required by ./linaro/gcc-linaro-11.3.1-2022.06-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/11.3.1/../../../../arm-linux-gnueabihf/bin/ld)

I have also downloaded "sysroot-glibc-linaro-2.34-2022.06-arm-linux-gnueabihf" from the GNU toolchain package and changed the path, but I still get the GLIBC_2.27 not found error.

please help me on this.

Cannot find -lrt and -lpthread during make build with "gcc-arm-none-eabi-10.3-2021.10"

Hi,

Not able to make build with "gcc-arm-none-eabi-10.3-2021.10"

I changed gcc64.mak and common.mak file to add gcc-arm-none-eabi-10.3-2021.10 path. These files are present in util/make directory.

Changes mainly for below lines:
TOOLS = ./coremark-pro-main/gcc-arm/x86_64/gcc-arm-none-eabi-10.3-2021.10
CC = $(TOOLS)/bin/arm-none-eabi-gcc
AS = $(TOOLS)/bin/arm-none-eabi-as
LD = $(TOOLS)/bin/arm-none-eabi-gcc
AR = $(TOOLS)/bin/arm-none-eabi-ar
INCLUDE = $(TOOLS)/arm-none-eabi/include

ARM gcc-arm-none-eabi-10.3-2021.10 package installed from ARM download link

Command= make TARGET=linux64 build-all

Error Message=
make[3]: Leaving directory `./coremark-pro-main/builds/linux64/gcc64/obj/bench/darkmark/zip'
./coremark-pro-main/gcc-arm/x86_64/gcc-arm-none-eabi-10.3-2021.10/bin/arm-none-eabi-gcc -o./coremark-pro-main/builds/linux64/gcc64/bin/zip-test.exe zip-test.o ./coremark-pro-main/builds/linux64/gcc64/obj/bench/darkmark/zip/*.o ./coremark-pro-main/builds/linux64/gcc64/obj/mith.a -lm -lrt -lpthread -lm -lrt -lpthread
./coremark-pro-main/gcc-arm/x86_64/gcc-arm-none-eabi-10.3-2021.10/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: cannot find -lrt
./coremark-pro-main/gcc-arm/x86_64/gcc-arm-none-eabi-10.3-2021.10/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: cannot find -lpthread
./coremark-pro-main/gcc-arm/x86_64/gcc-arm-none-eabi-10.3-2021.10/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: cannot find -lrt
./coremark-pro-main/gcc-arm/x86_64/gcc-arm-none-eabi-10.3-2021.10/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: cannot find -lpthread
collect2: error: ld returned 1 exit status
make[2]: *** [./coremark-pro-main/builds/linux64/gcc64/bin/zip-test.exe] Error 1
make[2]: Leaving directory `./coremark-pro-main/builds/linux64/gcc64/obj/workloads/zip-test'
make[1]: *** [build-all] Error 2
make[1]: Leaving directory `./coremark-pro-main'
make: *** [build] Error 2

Thanks,
SK

Unable to build the benchmark for ARM AARCH64

Hello,

I'm trying to cross compile the benchmark for ARM AARCH64 .. I'm using: gcc-7.3

I basically changed the following about util/make/gcc64.mk:
CC = $(COMPILER_HOME)/bin/aarch64-buildroot-linux-gnu-gcc
AS = $(COMPILER_HOME)/bin/aarch64-buildroot-linux-gnu-as
LD = $(COMPILER_HOME)/bin/aarch64-buildroot-linux-gnu-ld
AR = $(COMPILER_HOME)/bin/aarch64-buildroot-linux-gnu-ar

Also:
INCLUDE = $(COMPILER_HOME)/include/

However, the following error occurs when I use make TARGET=linux64 build:

coremark-pro/coremark-pro/builds/linux64/gcc64/obj/mith.a(th_lib.o): undefined reference to symbol 'vsnprintf@@GLIBC_2.17' /project/codelink/tools/toolchains/mentor/aarch64/gcc-7.3.0/bin/../aarch64-buildroot-linux-gnu/sysroot/lib64/libc.so.6: error adding symbols: DSO missing from command line

Any idea how to resolve this?
Thank you in advance.

for baremetal run, cjpeg macro variable SELECT_PRESET_ID is wrong

hi there,
coremarkPro with cygwin TARGET=linux64 is ok to run,
but when I try to port package to bare-metal env. without file I/O,
cjpeg always fail with "Cannot find file Rose256.bmp".

In "coremark-pro/workloads/cjpeg-rose7-preset/Makefile"
line 5 export SELECT_PRESET_ID=0
in "coremark-pro/benchmarks/consumer_v2/cjpeg/bmark_lite.c"
line 193-204
SELECT_PRESET_ID==1 maps to Rose256
SELECT_PRESET_ID==2 maps to goose
The mismatch causes the failure.
(in cygwin, strace shows cjpeg-rose7-preset.exe uses file I/O system calls to get the data, not the internal preset buffer)

After correcting the Makefile, cjpeg is passed without error.

Compiling without Make

I'm doing some research related to LLVM and need the LLVM IR of each individual benchmark. This means I have to bypass the Makefile and use clang manually to generate IR from the relevant C files. I've managed to do this but when I then compile the IR to binary and run the benchmark workload, the workload prints that there's been an error. If I compile with make then the benchmarks all run fine.

For example, for the loop kernels benchmark, in the directory coremark-pro/workloads/loops-all-mid-10k-sp, I run the following:
clang -O3 -S -emit-llvm -I ../../mirth/include -I ../../mirth/al/include -I ../../benchmarks/fp/loops -o loops-all-mid-10k-sp.ll

I also defined USE_FP32=1.

I run the same clang command for all the .c files in /mirth/src, /mirth/al/src and /benchmarks/fp/loops before combining all the .ll files into one using llvm-link. I then compile the single IR file to a binary and execute, and get the following:

$ ./a.out -v0
-  Info: Starting Run...
-- Workload:loops-all-mid-10k-sp=1814569103
-- loops-all-mid-10k-sp:time(ns)=7000
-- loops-all-mid-10k-sp:ERRORS=50
-- loops-all-mid-10k-sp:contexts=1
-- loops-all-mid-10k-sp:iterations=50
-- loops-all-mid-10k-sp:time(secs)=       7
-- loops-all-mid-10k-sp:secs/workload=    0.14
-- loops-all-mid-10k-sp:workloads/sec= 7.14286
-- accbits:min=0
-- accbits:max=0
-- accbits:avg=0
-- Done:loops-all-mid-10k-sp=1814569103

What am I missing in my build process to make this work? I've tried going through coremark's own make files but can't figure it out. Any help would be appreciated, thanks.

Query : WORKLOAD RESULTS

Hi,
I am running CoreMark-PRO on ARM by running the individual binaries, and I believe you are using the following:

1000 x geomean( ( subtestscore / reference score ) * scale factor)

but I am not able to understand how you calculate the subtest score and the reference score.

I am attaching output for cjpeg-rose7-preset.exe

pankaj@pankaj~# ./cjpeg-rose7-preset.exe
Rose256.bmp data supplied by C array

Data Set : Rose256.bmp
Output File : Rose256.jpg
Rose256.bmp data supplied by C array
Data Set : Rose256.bmp
Output File : Rose256.jpg
Rose256.bmp data supplied by C array
Data Set : Rose256.bmp
Output File : Rose256.jpg
Rose256.bmp data supplied by C array
Data Set : Rose256.bmp
Output File : Rose256.jpg
Rose256.bmp data supplied by C array
Data Set : Rose256.bmp
Output File : Rose256.jpg
Rose256.bmp data supplied by C array
Data Set : Rose256.bmp
Output File : Rose256.jpg
Rose256.bmp data supplied by C array
Data Set : Rose256.bmp
Output File : Rose256.jpg

    -  Info: Starting Run...
    -- Workload:cjpeg-rose7-preset=236760500
    -- cjpeg-rose7-preset:time(ns)=2009
    -- cjpeg-rose7-preset:contexts=1
    -- cjpeg-rose7-preset:iterations=1
    -- cjpeg-rose7-preset:time(secs)= 2.009
    -- cjpeg-rose7-preset:secs/workload= 2.009
    -- cjpeg-rose7-preset:workloads/sec= 0.49776
    Info: This run was executed with verification turned on! For performance results, use -v0.
    -- cjpeg-data1:UID=10000
    -- cjpeg-data1:fails=0
    -- cjpeg-data1:time(ticks)=269
    -- cjpeg-data1:count=1
    -- cjpeg-data1:repeats=1
    -- cjpeg-data1:v1=0
    -- cjpeg-data1:v2=0
    -- cjpeg-data1:v3=0
    -- cjpeg-data1:v4=0
    -- cjpeg-data1:f1=0.000000e+00
    -- cjpeg-data1:f2=0.000000e+00
    -- cjpeg-data1:f3=0.000000e+00
    -- cjpeg-data1:f4=0.000000e+00
    -- cjpeg-data1:secs/repeat= 0.269
    -- cjpeg-data1:repeats/sec= 3.71747
    -- cjpeg-data1:time(secs)= 0.269
    -- cjpeg-data1:secs/item= 0.269
    -- cjpeg-data1:items/sec= 3.71747
    -- cjpeg-data1:UID=10001
    -- cjpeg-data1:fails=0
    -- cjpeg-data1:time(ticks)=262
    -- cjpeg-data1:count=1
    -- cjpeg-data1:repeats=1
    -- cjpeg-data1:v1=0
    -- cjpeg-data1:v2=0
    -- cjpeg-data1:v3=0
    -- cjpeg-data1:v4=0
    -- cjpeg-data1:f1=0.000000e+00
    -- cjpeg-data1:f2=0.000000e+00
    -- cjpeg-data1:f3=0.000000e+00
    -- cjpeg-data1:f4=0.000000e+00
    -- cjpeg-data1:secs/repeat= 0.262
    -- cjpeg-data1:repeats/sec= 3.81679
    -- cjpeg-data1:time(secs)= 0.262
    -- cjpeg-data1:secs/item= 0.262
    -- cjpeg-data1:items/sec= 3.81679
    -- cjpeg-data1:UID=10002
    -- cjpeg-data1:fails=0
    -- cjpeg-data1:time(ticks)=264
    -- cjpeg-data1:count=1
    -- cjpeg-data1:repeats=1
    -- cjpeg-data1:v1=0
    -- cjpeg-data1:v2=0
    -- cjpeg-data1:v3=0
    -- cjpeg-data1:v4=0
    -- cjpeg-data1:f1=0.000000e+00
    -- cjpeg-data1:f2=0.000000e+00
    -- cjpeg-data1:f3=0.000000e+00
    -- cjpeg-data1:f4=0.000000e+00
    -- cjpeg-data1:secs/repeat= 0.264
    -- cjpeg-data1:repeats/sec= 3.78788
    -- cjpeg-data1:time(secs)= 0.264
    -- cjpeg-data1:secs/item= 0.264
    -- cjpeg-data1:items/sec= 3.78788
    -- cjpeg-data1:UID=10003
    -- cjpeg-data1:fails=0
    -- cjpeg-data1:time(ticks)=265
    -- cjpeg-data1:count=1
    -- cjpeg-data1:repeats=1
    -- cjpeg-data1:v1=0
    -- cjpeg-data1:v2=0
    -- cjpeg-data1:v3=0
    -- cjpeg-data1:v4=0
    -- cjpeg-data1:f1=0.000000e+00
    -- cjpeg-data1:f2=0.000000e+00
    -- cjpeg-data1:f3=0.000000e+00
    -- cjpeg-data1:f4=0.000000e+00
    -- cjpeg-data1:secs/repeat= 0.265
    -- cjpeg-data1:repeats/sec= 3.77358
    -- cjpeg-data1:time(secs)= 0.265
    -- cjpeg-data1:secs/item= 0.265
    -- cjpeg-data1:items/sec= 3.77358
    -- cjpeg-data1:UID=10004
    -- cjpeg-data1:fails=0
    -- cjpeg-data1:time(ticks)=249
    -- cjpeg-data1:count=1
    -- cjpeg-data1:repeats=1
    -- cjpeg-data1:v1=0
    -- cjpeg-data1:v2=0
    -- cjpeg-data1:v3=0
    -- cjpeg-data1:v4=0
    -- cjpeg-data1:f1=0.000000e+00
    -- cjpeg-data1:f2=0.000000e+00
    -- cjpeg-data1:f3=0.000000e+00
    -- cjpeg-data1:f4=0.000000e+00
    -- cjpeg-data1:secs/repeat= 0.249
    -- cjpeg-data1:repeats/sec= 4.01606
    -- cjpeg-data1:time(secs)= 0.249
    -- cjpeg-data1:secs/item= 0.249
    -- cjpeg-data1:items/sec= 4.01606
    -- cjpeg-data1:UID=10005
    -- cjpeg-data1:fails=0
    -- cjpeg-data1:time(ticks)=250
    -- cjpeg-data1:count=1
    -- cjpeg-data1:repeats=1
    -- cjpeg-data1:v1=0
    -- cjpeg-data1:v2=0
    -- cjpeg-data1:v3=0
    -- cjpeg-data1:v4=0
    -- cjpeg-data1:f1=0.000000e+00
    -- cjpeg-data1:f2=0.000000e+00
    -- cjpeg-data1:f3=0.000000e+00
    -- cjpeg-data1:f4=0.000000e+00
    -- cjpeg-data1:secs/repeat= 0.25
    -- cjpeg-data1:repeats/sec= 4
    -- cjpeg-data1:time(secs)= 0.25
    -- cjpeg-data1:secs/item= 0.25
    -- cjpeg-data1:items/sec= 4
    -- cjpeg-data1:UID=10006
    -- cjpeg-data1:fails=0
    -- cjpeg-data1:time(ticks)=259
    -- cjpeg-data1:count=1
    -- cjpeg-data1:repeats=1
    -- cjpeg-data1:v1=0
    -- cjpeg-data1:v2=0
    -- cjpeg-data1:v3=0
    -- cjpeg-data1:v4=0
    -- cjpeg-data1:f1=0.000000e+00
    -- cjpeg-data1:f2=0.000000e+00
    -- cjpeg-data1:f3=0.000000e+00
    -- cjpeg-data1:f4=0.000000e+00
    -- cjpeg-data1:secs/repeat= 0.259
    -- cjpeg-data1:repeats/sec= 3.861
    -- cjpeg-data1:time(secs)= 0.259
    -- cjpeg-data1:secs/item= 0.259
    -- cjpeg-data1:items/sec= 3.861
    -- Items:total(ticks)=1818
    -- Items:total(secs)= 1.818
    -- Done:cjpeg-rose7-preset=236760500

Can you please explain how to derive subtestscore and reference score from the output above, and also the following?

  1. MultiCore (iter/s)
  2. SingleCore (iter/s)
  3. Scaling
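As a rough illustration of how the per-workload rate in the log relates to the other fields, the sketch below parses the `-- name:key=value` lines and recomputes `workloads/sec` as iterations divided by `time(secs)`. The helper names are hypothetical, and treating Scaling as the ratio of the multi-core rate to the single-core rate is an assumption, not something the output above confirms. Note also that this log was produced with verification on, and the tool itself recommends `-v0` for performance results.

```python
def parse_fields(log_text):
    """Collect '-- key=value' lines into a dict (later duplicates overwrite)."""
    fields = {}
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("-- ") or "=" not in line:
            continue
        key, _, value = line[3:].partition("=")
        fields[key.strip()] = value.strip()
    return fields


def workloads_per_sec(fields, workload):
    """Iterations divided by elapsed seconds for the named workload."""
    iterations = int(fields[f"{workload}:iterations"])
    secs = float(fields[f"{workload}:time(secs)"])
    return iterations / secs


def scaling(multi_rate, single_rate):
    """Assumed definition: multi-core rate over single-core rate."""
    return multi_rate / single_rate


# Lines copied from the cjpeg-rose7-preset output above.
sample = """
-- cjpeg-rose7-preset:contexts=1
-- cjpeg-rose7-preset:iterations=1
-- cjpeg-rose7-preset:time(secs)= 2.009
"""
rate = workloads_per_sec(parse_fields(sample), "cjpeg-rose7-preset")
print(round(rate, 5))  # 0.49776, matching the workloads/sec line in the log
```

Under that assumption, a single-core rate would come from a run with one context and a multi-core rate from a run with several contexts, with each rate read from the `workloads/sec` line of its run.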

Thanks

make TARGET=linux certify-all not computing results

Hi,
I am running CoreMark-PRO on an ARM-based board that supports both 64-bit and 32-bit.
When I build and run with TARGET=linux64, the score is computed, but when I change it to TARGET=linux,
the overall score is not generated.
