freqbench

Power usage in mW per frequency per cluster for Qualcomm Snapdragon 835, 855, and 765G

freqbench is a comprehensive CPU benchmark that benchmarks each CPU frequency step on each frequency scaling domain (e.g. ARM DynamIQ/big.LITTLE cluster). It is based on a minimal Alpine Linux userspace with the EEMBC CoreMark workload and a Python benchmark coordinator.

Results include:

  • Performance (CoreMark scores)
  • Performance efficiency (CoreMarks per MHz)
  • Power usage (in milliwatts)
  • Energy usage (in millijoules and joules)
  • Energy efficiency (ULPMark-CM scores: iterations per millijoule of energy used)
  • Baseline power usage
  • Time elapsed
  • CPU frequency scaling stats during the benchmark (for validation)
  • Diagnostic data (logs, kernel version, kernel command line, interrupts, processes)
  • Raw power samples in machine-readable JSON format (for postprocessing)
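
The derived metrics in this list follow from four values measured at each frequency step. The sketch below is only an illustration and not the coordinator's actual code: the function name `derive_metrics` is hypothetical, and it assumes the reported power already has the baseline subtracted.

```python
def derive_metrics(freq_mhz, coremark_score, power_mw, elapsed_s):
    """Derive per-frequency metrics from measured values (illustrative only).

    coremark_score is in iterations per second; power_mw is assumed to be
    the mean power above baseline, in milliwatts.
    """
    energy_mj = power_mw * elapsed_s  # mW * s = mJ
    return {
        "coremarks_per_mhz": coremark_score / freq_mhz,
        "energy_mj": energy_mj,
        "energy_j": energy_mj / 1000,
        # (iterations / s) / (mJ / s) = iterations per millijoule
        "ulpmark_cm": coremark_score / power_mw,
    }


if __name__ == "__main__":
    for name, value in derive_metrics(300, 1114, 56, 224.5).items():
        print(f"{name}: {value:.2f}")
```

Feeding in the 300 MHz row of the sample run.log quoted in the issues further down (score 1114, 56 mW, 224.5 s) gives about 3.7 CoreMarks per MHz, 12.6 J, and 19.9 iterations per millijoule. The small differences from the logged 12.7 J and 19.8 I/mJ presumably come from rounding in the displayed power.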

Why?

A benchmark like this can be useful for many reasons:

  • Creating energy models for EAS (Energy-Aware Scheduling)
  • Correcting inaccurate EAS energy models
  • Analyzing performance and power trends
  • Comparing efficiency across SoC and CPU generations
  • Improving performance and battery life of mobile devices by utilizing the race-to-idle phenomenon with efficient frequencies

Usage

It is possible to use freqbench with a stock kernel, but a custom kernel is highly recommended for accuracy. Stock OEM kernels are almost always missing features that the benchmark coordinator relies on for maximum accuracy. Custom kernel results are eligible for high accuracy classification, while stock kernel results are limited to low accuracy. Use a stock kernel at your own risk.

Custom kernel (recommended)

Set the following kernel config options:

CONFIG_NO_HZ_FULL=y
CONFIG_CPU_FREQ_TIMES=n  # may not exist
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_HZ_100=y

Example commit: kirin_defconfig: Configure for freqbench
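
Before flashing, it can help to confirm that the built kernel really ended up with these options. The following is a minimal sketch, assuming you have the generated .config as text; the helper names `parse_kconfig` and `check_config` are hypothetical and not part of freqbench. Note that a generated .config records disabled options as `# CONFIG_X is not set` rather than `=n`, and an option that does not exist is treated as disabled.

```python
import re

# Options from the list above and the values they should have.
REQUIRED = {
    "CONFIG_NO_HZ_FULL": "y",
    "CONFIG_CPU_FREQ_TIMES": "n",
    "CONFIG_CPU_FREQ_GOV_USERSPACE": "y",
    "CONFIG_HZ_100": "y",
}


def parse_kconfig(text):
    """Parse the text of a generated kernel .config into {option: value}."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def check_config(text, required=REQUIRED):
    """Return a list of human-readable mismatches (empty if all is fine)."""
    options = parse_kconfig(text)
    problems = []
    for name, want in required.items():
        have = options.get(name, "n")  # a missing option counts as disabled
        if have != want:
            problems.append(f"{name}: expected {want}, found {have}")
    return problems
```

A typical use would be `check_config(open(".config").read())` in the kernel build directory, printing any returned lines.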

If you have any commits that prevent userspace from controlling CPU affinities and utilization, frequencies, or anything of the sort, revert them for the benchmark to work properly. Here are some common examples of such commits in downstream kernels and their corresponding reverts:

Example freqbench kernel adaptations:

Compile and flash your new kernel. Note that Android will not work properly on this kernel, so make sure you take a backup of your old boot image to restore later.

If necessary, adjust the config parameters in config.sh. Most modern devices will not need any changes. Run pack-zip.sh and flash freqbench-installer.zip.

Unplug the device immediately, before the device starts booting. Do not try to wait for it to finish booting. Leaving the device plugged in will invalidate all power results.

Finally, wait until the device reboots itself. Do not touch the device, any of its buttons, or plug/unplug it during the test. It will be frozen on the bootloader splash screen; do not assume that it is broken. The benchmark is expected to take a long time; 1 hour is reasonable for a slower CPU.

Once the benchmark is done, retrieve the results from /cache/freqbench if your device has a cache partition, or from /persist/freqbench or /mnt/vendor/persist/freqbench otherwise (newer devices with A/B partitions don't have a cache partition).

If you are able to retrieve results, please consider contributing your results! It's very helpful for me to see how well freqbench is working, and enables anyone to analyze results across different SoCs that they don't have.

If you have any problems, check the troubleshooting section before opening an issue.

Manual boot image creation

Manually creating a new boot image with the kernel and ramdisk is only for advanced users. Use the AnyKernel3 installer unless you have good reason to do this.

Additional kernel config options:

CONFIG_CMDLINE="rcu_nocbs=0-7 isolcpus=1-7 nohz_full=1-7 loglevel=0 printk.devkmsg=on"
CONFIG_CMDLINE_EXTEND=y

If you don't have 8 CPU cores, adjust 0-7 to 0-<core count - 1> and 1-7 to 1-<core count - 1> where appropriate. Single-core CPUs are not supported. Be careful when adjusting the CPU sets as rcu_nocbs starts at CPU 0 while all other parameters start at CPU 1.
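
The adjustment described above can be written down as a small helper. This is a sketch only: `build_cmdline` is a hypothetical name, and the fixed trailing options are copied from the CONFIG_CMDLINE value shown earlier.

```python
def cpu_range(first, last):
    """Format an inclusive CPU list such as '1-7' (or '1' for one CPU)."""
    return str(first) if first == last else f"{first}-{last}"


def build_cmdline(core_count):
    """Build the freqbench kernel command line for a given core count."""
    if core_count < 2:
        raise ValueError("single-core CPUs are not supported")
    last = core_count - 1
    all_cpus = cpu_range(0, last)    # rcu_nocbs starts at CPU 0
    bench_cpus = cpu_range(1, last)  # everything else starts at CPU 1
    return (
        f"rcu_nocbs={all_cpus} isolcpus={bench_cpus} "
        f"nohz_full={bench_cpus} loglevel=0 printk.devkmsg=on"
    )


if __name__ == "__main__":
    print(build_cmdline(6))
```

For a 6-core device this yields `rcu_nocbs=0-5 isolcpus=1-5 nohz_full=1-5 loglevel=0 printk.devkmsg=on`.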

Create a boot image with your modified kernel and the freqbench ramdisk:

For boot image v0/v1 devices:

cd boot-v1
./unpack.sh path/to/normal/boot.img
./pack.sh
# New boot image will be created as new.img

For boot image v3 devices:

# Extract values from boot.img and update pack-img.sh accordingly
./pack-img.sh
# New boot image will be created as bench.img

After that, boot the modified image with fastboot boot if your device supports it, or flash it to the boot/recovery partition and boot that manually.

Results

After the benchmark finishes, results can be found in /cache/freqbench, /persist/freqbench, or /mnt/vendor/persist/freqbench, in that order of preference. The first path that exists on your device will be used. Human-readable results, raw machine-readable JSON data, and diagnostic information are included for analysis.
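
If you script result retrieval (for example from a recovery shell with Python available), the order of preference above can be expressed as follows. This is a hedged sketch: `find_results_dir` is a hypothetical helper, and it checks for the freqbench directory itself rather than reproducing however the coordinator picks its output location.

```python
import os

# Candidate locations, in the order of preference described above.
RESULT_DIRS = (
    "/cache/freqbench",
    "/persist/freqbench",
    "/mnt/vendor/persist/freqbench",
)


def find_results_dir(candidates=RESULT_DIRS, exists=os.path.isdir):
    """Return the first candidate directory that exists, or None."""
    for path in candidates:
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(find_results_dir() or "no freqbench results found")
```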

If you got this far, please consider contributing your results to help freqbench evolve and gather data about different SoCs:

Contributing results

If you run the benchmark on a SoC that is not already included, please contribute your results! It's very helpful for me to see how well freqbench is working, and enables anyone to analyze results across different SoCs that they don't have.

Contributing your results is easy:

  1. Fork this repository
  2. Add your entire results folder (not just one file from it) to results/socname/main, replacing socname with the model name of your SoC in lowercase
  3. Open a pull request

If you don't know your SoC's model name, search the name of your SoC (e.g. Snapdragon 855) and find the part number from the SoC manufacturer. You can also get it from your kernel source code and/or device.txt or cpuinfo.txt in the freqbench results. If you are still unsure, feel free to open an issue or guess the name.

Example names:

  • sm8150
  • sm8150ac
  • sm7250ab
  • exynos8895
  • mt6889

Identifiable information such as the device serial number is automatically redacted by freqbench, so sharing your results should not expose it.

Don't worry about getting something wrong; I would much rather have results submitted with mistakes than nothing at all.

Post-processing

Several post-processing scripts, all written in Python and some using matplotlib, are available:

Legacy energy model

Create a legacy EAS energy model for use with older kernels.

Optional argument after path to results: key_type/value_type

Key types:

  • Frequency (default) - looks like 652800 or 2323200
  • Capacity - looks like 139 or 1024

You must use the correct key type for your kernel. When in doubt, refer to your original energy model and check which one the numbers look more like.

In general, downstream Qualcomm kernels will use the following key types depending on version:

  • 3.18: capacity
  • 4.4: capacity
  • 4.9: frequency
  • 4.14: frequency
  • 4.19: N/A (uses simplified energy model instead)

Modifying your kernel to switch from one to the other is left as an exercise for the reader.
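
The "which one do the numbers look more like" check can be approximated in code. The sketch below is a heuristic under stated assumptions, not part of freqbench: `guess_key_type` is a hypothetical helper, and it relies on capacities being normalized to at most 1024 while frequencies in kHz are far larger.

```python
def guess_key_type(keys):
    """Guess whether energy model keys are capacities or frequencies.

    keys: the numeric keys copied from an existing energy model.
    Returns "capacity" or "frequency".
    """
    keys = list(keys)
    if not keys:
        raise ValueError("no keys given")
    # Capacities top out at 1024; kHz frequencies are in the hundreds of
    # thousands or more, so the largest key is enough to tell them apart.
    return "capacity" if max(keys) <= 1024 else "frequency"


if __name__ == "__main__":
    print(guess_key_type([139, 1024]))
    print(guess_key_type([652800, 2323200]))
```

Treat the result as a hint and still compare against your original energy model before generating a new one.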

Value types:

  • Power (default)
  • Energy (experimental, not recommended)

Do not change the value type unless you know what you're doing. The energy type only exists for testing purposes; do not expect it to work properly.

Once you have a full energy model generated, pick out the parts you need and incorporate them into your SoC device tree. In general, the CPU sections need capacity-dmips-mhz on 4.19 kernels and efficiency on older kernels.

If you have an existing energy model that you want to use for idle and cluster costs, add it as an argument.

Example usage: ./legacy_energy_model.py results.json cap/power old_model.dtsi

Simplified energy model

Create a simplified EAS energy model for use with newer kernels.

Because voltages defined by the CPU frequency scaling driver cannot easily be accessed from userspace, you will need to provide them. Pass the voltage for each frequency step as an argument: cpu#.khz=microvolts

For Qualcomm SoCs on the msm-4.19 kernel, voltages can be obtained by booting the kernel (with or without freqbench doesn't matter, as long as you can get kernel logs) with this commit and searching for lines containing volt= in the kernel log.

For msm-4.9 and msm-4.14, the process is the same but with this commit and searching for open_loop_voltage instead.

Example usage: ./simplified_energy_model.py results.json 1.300000=580000 1.576000=580000 1.614400=580000 1.864000=644000 1.1075200=708000 1.1363200=788000 1.1516800=860000 1.1651200=888000 1.1804800=968000 6.652800=624000 6.940800=672000 6.1152000=704000 6.1478400=752000 6.1728000=820000 6.1900800=864000 6.2092800=916000 6.2208000=948000 7.806400=564000 7.1094400=624000 7.1401600=696000 7.1766400=776000 7.1996800=836000 7.2188800=888000 7.2304000=916000 7.2400000=940000
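
To make the cpu#.khz=microvolts format concrete, here is a minimal parsing sketch. It is not taken from simplified_energy_model.py; `parse_voltage_args` is a hypothetical helper that only illustrates how each argument breaks down.

```python
def parse_voltage_args(args):
    """Parse 'cpu.khz=microvolts' strings into {(cpu, khz): microvolts}."""
    voltages = {}
    for arg in args:
        key, _, microvolts = arg.partition("=")
        cpu, _, khz = key.partition(".")
        if not (cpu.isdigit() and khz.isdigit() and microvolts.isdigit()):
            raise ValueError(f"expected cpu.khz=microvolts, got {arg!r}")
        voltages[(int(cpu), int(khz))] = int(microvolts)
    return voltages


if __name__ == "__main__":
    parsed = parse_voltage_args(["1.300000=580000", "7.2400000=940000"])
    for (cpu, khz), uv in parsed.items():
        print(f"cpu{cpu} @ {khz} kHz: {uv / 1000:.0f} mV")
```

In the example above, `1.300000=580000` therefore means 580 mV at 300000 kHz on the frequency domain led by CPU 1.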

Efficient frequencies (experimental)

Derive a list of efficient frequencies for each cluster and create a new results.json with only those frequencies included.

Note that this script is experimental and may not produce optimal results. Manual tuning of the resulting frequency tables is recommended.

Example usage: ./efficient_freqs.py results.json eff_results.json
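
One simple way to think about "efficient" frequencies is race-to-idle dominance: a frequency is not worth keeping if some higher frequency finishes the same work using no more energy. The sketch below implements only that heuristic. It is an assumption for illustration and not the algorithm used by efficient_freqs.py; `efficient_freqs` here is a hypothetical function.

```python
def efficient_freqs(energy_by_freq):
    """Keep frequencies that no higher frequency beats on energy.

    energy_by_freq: {frequency: energy used for a fixed workload}.
    Returns the kept frequencies in ascending order.
    """
    kept = []
    best_energy = float("inf")
    # Walk from the highest frequency down, tracking the lowest energy
    # seen so far among the higher frequencies.
    for freq in sorted(energy_by_freq, reverse=True):
        energy = energy_by_freq[freq]
        if energy < best_energy:
            kept.append(freq)
            best_energy = energy
    return sorted(kept)


if __name__ == "__main__":
    sample = {748: 5.7, 825: 6.8, 902: 5.5, 1612: 6.1, 1689: 6.0, 1766: 6.6}
    print(efficient_freqs(sample))
```

This rule always keeps the highest frequency and discards everything below the most energy-efficient step, which may be too aggressive if you also care about minimum instantaneous power; that is one reason manual tuning remains advisable.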

Filter frequencies

Create a new results.json with only the specified frequencies included.

Example usage: ./filter_freqs.py results.json filtered_results.json 1.1516800 1.1804800 6.1478400 6.1728000 6.2208000 7.1766400 7.2188800 7.2304000 7.2400000

Cross-CPU cluster graph

Performance (iter/s) across 835, 855, and 765G

Graph a value for each cluster across different SoCs/CPUs.

Arguments:

  • Add a SoC: SoC-1:soc1/results.json
  • Specify the value to graph: load/value (load is idle/active)
  • Set a flag: +flagname (soccolor, minscl)

Example usage: ./cross_cpu_cluster_graph.py 835:results/p2/main/results.json 855:results/zf6/main/results.json 855+:results/rog2/main/results.json 765G:results/p5/new-final/results.json active/power_mean +soccolor +minscl
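
The three argument kinds can be told apart by their shape. The following sketch shows one way to do that; it is an assumption about how such arguments might be classified, not the script's real parser, and `parse_graph_args` is a hypothetical name.

```python
def parse_graph_args(args):
    """Split arguments into SoC paths, the value to graph, and flags.

    Returns (socs, value, flags) where socs maps a display name to a
    results.json path, value is a (load, metric) tuple or None, and
    flags is a set of flag names without the leading '+'.
    """
    socs, value, flags = {}, None, set()
    for arg in args:
        if arg.startswith("+"):
            flags.add(arg[1:])
        elif ":" in arg:
            # Check for ':' before '/', since paths contain slashes too.
            name, _, path = arg.partition(":")
            socs[name] = path
        else:
            load, _, metric = arg.partition("/")
            value = (load, metric)
    return socs, value, flags


if __name__ == "__main__":
    print(parse_graph_args([
        "835:results/p2/main/results.json",
        "855+:results/rog2/main/results.json",
        "active/power_mean",
        "+soccolor",
    ]))
```

Note that a SoC name such as `855+` ends with a plus sign but is not a flag, because only a leading `+` marks a flag.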

Unified cluster graph

Performance (iter/s) across 765G little, big, and prime clusters

Graph a value for each cluster within the same SoC/CPU.

Example usage: ./unified_cluster_graph.py results.json coremark_score

Unified cluster column

Extract a value for each cluster within the same SoC/CPU and write the results into a CSV file.

Example usage: ./unified_cluster_csv.py results.json coremark_score cm_scores.csv

Troubleshooting

Kernel panics on boot

If your kernel panics on boot, disable CONFIG_CPU_FREQ_STAT. If that causes the kernel to fail to compile, cherry-pick cpufreq: Fix build with stats disabled.

Results vary too much

Check kernel.log, pre- and post-bench interrupts, running processes, and cpufreq stats from the results directory to diagnose the issue.

It's still running after an hour

If you have a slow CPU with a lot of frequency steps, this is not entirely unreasonable.
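
A rough estimate of the expected runtime can help you decide whether to keep waiting. The sketch below assumes a fixed iteration count per frequency step (250000, the value seen in the sample coremark invocation quoted in the issues) and a constant CoreMarks-per-MHz figure for the cluster; both are assumptions, `estimate_runtime_s` is a hypothetical helper, and baseline measurement and any settling time are not included.

```python
def estimate_runtime_s(freqs_mhz, coremarks_per_mhz, iterations=250_000):
    """Estimate the total benchmark time in seconds for one cluster.

    Each step runs `iterations` iterations at roughly
    coremarks_per_mhz * frequency iterations per second.
    """
    return sum(iterations / (coremarks_per_mhz * f) for f in freqs_mhz)


if __name__ == "__main__":
    little = [300, 403, 480, 576, 652, 748, 825, 902, 979, 1056, 1132,
              1228, 1324, 1420, 1516, 1612, 1689, 1766]
    print(f"{estimate_runtime_s(little, 3.7) / 60:.0f} min")
```

For the 18-step little cluster in the sample run.log (300 to 1766 MHz at 3.7 CoreMarks per MHz) this comes to roughly 25 minutes for that cluster alone, so a total of an hour or more across several clusters is plausible.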

I want to debug it while it's running

freqbench offers interactive debugging via SSH over virtual USB Ethernet; the device acts as a USB Ethernet adapter and exposes an SSH server on the internal network. This feature can be enabled with the USB_DEBUG option in config.sh. It is disabled by default to avoid unnecessary USB setup that may influence benchmark results, so keeping it enabled for a final benchmark run is not recommended.

CONFIG_USB_CONFIGFS_RNDIS must be enabled for this feature to work. If your kernel does not have or use configfs for USB configuration, it will not work regardless of whether you have the RNDIS function enabled.

Once it's enabled, connect your device to a computer over USB. You should see something like this in your kernel logs if you are running Linux:

[7064379.627645] usb 7-3: new high-speed USB device number 114 using xhci_hcd
[7064379.772208] usb 7-3: New USB device found, idVendor=0b05, idProduct=4daf, bcdDevice= 4.14
[7064379.772210] usb 7-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[7064379.772211] usb 7-3: Product: Alpine GNU/Linux
[7064379.772211] usb 7-3: Manufacturer: Linux
[7064379.772212] usb 7-3: SerialNumber: ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null [email protected]
[7064379.818904] rndis_host 7-3:1.0 usb0: register 'rndis_host' at usb-0000:47:00.1-3, RNDIS device, da:34:ab:99:c5:81
[7064379.870018] rndis_host 7-3:1.0 enp71s0f1u3: renamed from usb0

Run the SSH command in the serial number field to open a shell to the device. The password is empty, so just press enter when asked to provide a password.

Contributors

aarqw12, arbitraryfox, cyberknight777, dreamisbaka, electimon, eun0115, flamingradian, gamer13433, jjpprrrr, jprimero15, kawaaii, kdrag0n, kenhv, kerneltoast, kondors1995, krishnakantshedge, mrartemsid, mvaisakh, nakixii, orges, panchajanya1999, pokemetti, reinazhard, rote66, snnbyyds, sourajitk, stendro, tashar02, xnombre, yarost12


freqbench's Issues

sdm845: Maximum frequency error on big cluster

freqbench stopped after about 20 minutes and rebooted to fastboot. I was only able to get two files from the /cache/freqbench directory: pre_bench_interrupts.txt and run.log. run.log has all frequency results for CPU1, but the run failed on CPU4.


  __                _                     _     
 / _|_ __ ___  __ _| |__   ___ _ __   ___| |__  
| |_| '__/ _ \/ _` | '_ \ / _ \ '_ \ / __| '_ \ 
|  _| | |  __/ (_| | |_) |  __/ | | | (__| | | |
|_| |_|  \___|\__, |_.__/ \___|_| |_|\___|_| |_|
                 |_|                            

           CPU benchmark • by kdrag0n

------------------------------------------------

Frequency domains: cpu1 cpu4 
Offline CPUs: cpu1 cpu2 cpu3 cpu4 cpu5 cpu6 cpu7 
Sampling power every 1000 ms
Baseline power usage: 1288 mW


===== CPU 1 =====
Frequencies: 300 403 480 576 652 748 825 902 979 1056 1132 1228 1324 1420 1516 1612 1689 1766

 300:  1114     3.7 C/MHz     56 mW   12.7 J   19.8 I/mJ   224.5 s
 403:  1497     3.7 C/MHz     48 mW    8.1 J   31.0 I/mJ   167.0 s
 480:  1782     3.7 C/MHz     51 mW    7.2 J   34.9 I/mJ   140.3 s
 576:  2138     3.7 C/MHz     60 mW    7.0 J   35.8 I/mJ   116.9 s
 652:  2424     3.7 C/MHz     63 mW    6.5 J   38.5 I/mJ   103.2 s
 748:  2780     3.7 C/MHz     64 mW    5.7 J   43.5 I/mJ    89.9 s
 825:  3065     3.7 C/MHz     84 mW    6.8 J   36.6 I/mJ    81.6 s
 902:  3350     3.7 C/MHz     73 mW    5.5 J   45.7 I/mJ    74.6 s
 979:  3635     3.7 C/MHz     93 mW    6.4 J   38.9 I/mJ    68.8 s
1056:  3921     3.7 C/MHz    102 mW    6.5 J   38.5 I/mJ    63.8 s
1132:  4205     3.7 C/MHz    107 mW    6.4 J   39.1 I/mJ    59.5 s
1228:  4562     3.7 C/MHz    119 mW    6.5 J   38.4 I/mJ    54.8 s
1324:  4918     3.7 C/MHz    152 mW    7.7 J   32.4 I/mJ    50.8 s
1420:  5275     3.7 C/MHz    132 mW    6.3 J   39.9 I/mJ    47.4 s
1516:  5631     3.7 C/MHz    155 mW    6.9 J   36.4 I/mJ    44.4 s
1612:  5988     3.7 C/MHz    147 mW    6.1 J   40.8 I/mJ    41.8 s
1689:  6273     3.7 C/MHz    150 mW    6.0 J   41.8 I/mJ    39.9 s
1766:  6558     3.7 C/MHz    172 mW    6.6 J   38.1 I/mJ    38.1 s


===== CPU 4 =====
Traceback (most recent call last):
  File "/bench.py", line 479, in <module>
    main()
  File "/bench.py", line 335, in main
    raise ValueError(f"Maximum frequency setting {max(freqs)} rejected by kernel; got {real_max_freq}")
ValueError: Maximum frequency setting 2649600 rejected by kernel; got 1286400

Are there any other commits I need to revert, or configs I did not turn off? I used a modified lineage xiaomi sdm845 kernel (link) with HEAD at fd19a6e41bc486f0ae82e55112c4618049b52d5b.

I added/changed the following configs for polaris_defconfig:

CONFIG_DEVTMPFS=y
CONFIG_NO_HZ_FULL=y
CONFIG_CPU_FREQ_STAT=n
CONFIG_CPU_FREQ_TIMES=n

sdm710

The sdm710 directory contains data from the sdm712 SoC.

Benchmark is running too fast on 3.2GHz Cortex-A77

Less than 10 seconds

phone:~# echo 1 > /sys/devices/system/cpu/cpu7/online 
phone:~# taskset -c 7 coremark 0x0 0x0 0x66 250000 7 1 2000
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 9264
Total time (secs): 9.264000
Iterations/Sec   : 26986.183074
ERROR! Must execute for at least 10 secs for a valid result!
Iterations       : 250000
Compiler version : GCC9.3.0
Compiler flags   : -O2 -DPERFORMANCE_RUN=1  -lrt
Memory location  : Please put data memory location here
			(e.g. code in flash, data on heap etc)
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0x5275
Errors detected
