Scope

Scope may be downloaded from https://github.com/c3sr/scope/releases

A benchmark framework developed by the IBM-ILLINOIS Center for Cognitive Computing Systems Research (C3SR) in collaboration with the IMPACT group at the University of Illinois.

Primary maintainers:

Project Advisors:

  • Prof. Wen-mei Hwu (UofI)
  • Dr. Jinjun Xiong (IBM Research)

Various benchmark suites using Scope are under development:

  • Comm|Scope - CUDA/NUMA data transfer performance (Carl Pearson, UIUC)
  • NCCL|Scope - GPU collective communication performance (Sarah Hashash, UIUC)
  • Histo|Scope - CUDA histogram techniques (Carl Pearson, UIUC)
  • DDL|Scope - IBM Distributed Deep Learning Library benchmarks (Vandana Kulkarni, UIUC)
  • TCU|Scope - CUDA/TCU performance primitives (Abdul Dakkak, UIUC)
  • FrameworkLayer|Scope - Evaluation of neural network layers across frameworks (Cheng Li and Abdul Dakkak, UIUC)
  • CUDNN|Scope - Evaluation of neural network layers using CuDNN (Cheng Li and Abdul Dakkak, UIUC)
  • Misc|Scope - experimental or miscellaneous benchmarks

Quick Start

  • Install CMake 3.12+
  • Clone, check out the latest release, update submodules to match, and build:
git clone https://github.com/c3sr/scope.git --recursive
cd scope
git checkout v1.3.2                 # or the latest, `git tag --list`
git submodule update                # match benchmark versions
mkdir build && cd build
cmake .. -DENABLE_COMM=ON           # or other scopes
make -j`nproc`
./scope --benchmark_list_tests=true # list all scopes

Install CMake >= 3.12

User install of CMake 3.12 (preferred)

If your system has CMake < 3.12, we suggest installing CMake 3.12+ in the user's $HOME directory.

On x86-64, the following will download CMake 3.12.0 and install it in $HOME/software/cmake-3.12.0.

cd /tmp
wget https://cmake.org/files/v3.12/cmake-3.12.0-Linux-x86_64.sh
mkdir -p $HOME/software/cmake-3.12.0
sh cmake-3.12.0-Linux-x86_64.sh --prefix=$HOME/software/cmake-3.12.0 --exclude-subdir

You will then need to add $HOME/software/cmake-3.12.0/bin to your path. For many Linux users, this means adding the following to $HOME/.bashrc:

export PATH="$HOME/software/cmake-3.12.0/bin:$PATH"
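
Before configuring the build, it can help to confirm that the cmake found on your PATH is new enough. The following is a minimal sketch, not part of Scope: version_ge is a hypothetical helper name, and it assumes a sort that supports -V (GNU coreutils).

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B (hypothetical helper).
# Assumes a sort that supports -V, e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only query cmake if one is actually on the PATH.
if command -v cmake >/dev/null 2>&1; then
    # The first line of "cmake --version" looks like "cmake version 3.12.0".
    found=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$found" 3.12.0; then
        echo "cmake $found is new enough"
    else
        echo "cmake $found is older than 3.12.0"
    fi
fi
```

If the reported version is still the old system one, check the order of the entries in PATH.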

On ppc64le, you will need to download the CMake source from the CMake website and build it.

System install of CMake 3.12

If you don't already know how to do this before reading, this is probably not the right option for you. First, uninstall any existing system install of CMake. Then, follow the User install instructions above, but choose a system prefix for the installation.

Compile

To compile the project, make sure nvcc is in your $PATH (it is typically installed at /usr/local/cuda/bin/nvcc), then run the following commands:

git clone https://github.com/c3sr/scope.git --recursive
cd scope
mkdir -p build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make

The build system uses Hunter to download all dependencies. If you have trouble downloading dependencies, check that Hunter/CMake can use SSL. Alternatively, you can forgo Hunter entirely and provide your own dependencies.

You will need to enable the particular scopes that provide the benchmarks you want to run:

Scope     CMake Option
CuDNN     -DENABLE_CUDNN=1
NCCL      -DENABLE_NCCL=1
Comm      -DENABLE_COMM=1
Example   -DENABLE_EXAMPLE=1 (default)

If you get errors about nvcc not supporting your gcc compiler, you may want to select a different host compiler, for example:

cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CUDA_HOST_COMPILER=`which gcc-6` .. 

You can optionally choose your own CUDA archs that you would like to be compiled:

cmake -DNVCC_ARCH_FLAGS="2.0 2.1 3.0 3.2 3.5 3.7 5.0 5.2 5.3" ..

The accepted syntax is the same as the CUDA_SELECT_NVCC_ARCH_FLAGS syntax in the FindCUDA module.

You can disable or enable individual scopes:

cmake -DENABLE_MISC=0 ...
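
When several scopes are enabled at once, the option list can be generated rather than typed. This is a small sketch, not part of Scope: scope_flags is a hypothetical helper, and it assumes every scope follows the -DENABLE_<NAME>=1 pattern of the options shown above.

```shell
#!/bin/sh
# scope_flags NAME...: print one -DENABLE_<NAME>=1 option per scope name.
# Hypothetical helper; assumes all scopes follow the ENABLE_<NAME> pattern.
scope_flags() {
    for name in "$@"; do
        upper=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')
        printf '%s ' "-DENABLE_${upper}=1"
    done
}

# Print (without running) a configure command for Comm|Scope and NCCL|Scope.
echo "cmake -DCMAKE_BUILD_TYPE=Release $(scope_flags comm nccl).."
```

The command is only echoed, so it can be inspected before it is run from the build directory.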

The submodules should automatically be checked out. If not, try checking them out yourself:

git submodule update --init --recursive

or, to update the submodules to the proper versions:

git submodule update --recursive --remote

Available Benchmarks

The available benchmarks and descriptions are listed here. You can list all the benchmarks with

./scope --benchmark_list_tests=true

You can filter the benchmarks that are run with a regular expression passed to --benchmark_filter:

./scope --benchmark_filter=[regex]

For example:

./scope --benchmark_filter=SGEMM
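
To check what a regular expression would select before starting a long run, the list of benchmark names can be matched offline. The sketch below uses grep -E as a stand-in matcher. preview_filter is a hypothetical helper, and the regex dialect used by the benchmark library may differ from grep's in corner cases, so treat the result as an approximation.

```shell
#!/bin/sh
# preview_filter REGEX: read benchmark names on stdin and print the ones that
# match REGEX. Hypothetical helper; grep -E only approximates the matcher
# used by --benchmark_filter.
preview_filter() {
    grep -E -- "$1" || true
}

# Sample names taken from the example output shown in this README.
names='SGEMM/1000/1/1/-1/1
SGEMM/128/169/1728/1/0
SGEMM/50/4096/4096/1/0'

# Keep only the benchmarks whose first argument is 50.
printf '%s\n' "$names" | preview_filter '^SGEMM/50/'
# prints: SGEMM/50/4096/4096/1/0
```

In practice the input would come from ./scope --benchmark_list_tests=true instead of a hard-coded list.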

Further controls over the benchmarks are explained in the output of the --help option.

Run all the benchmarks

This is not generally recommended, as it will take quite some time.

./scope

The above will output to stdout something like

------------------------------------------------------------------------------
Benchmark                       Time           CPU Iterations UserCounters...
------------------------------------------------------------------------------
SGEMM/1000/1/1/-1/1             5 us          5 us     126475 K=1 M=1000 N=1 alpha=-1 beta=1
SGEMM/128/169/1728/1/0        539 us        534 us       1314 K=1.728k M=128 N=169 alpha=1 beta=0
SGEMM/128/729/1200/1/0       1042 us       1035 us        689 K=1.2k M=128 N=729 alpha=1 beta=0
SGEMM/192/169/1728/1/0        729 us        724 us        869 K=1.728k M=192 N=169 alpha=1 beta=0
SGEMM/256/169/1/1/1             9 us          9 us      75928 K=1 M=256 N=169 alpha=1 beta=1
SGEMM/256/729/1/1/1            35 us         35 us      20285 K=1 M=256 N=729 alpha=1 beta=1
SGEMM/384/169/1/1/1            18 us         18 us      45886 K=1 M=384 N=169 alpha=1 beta=1
SGEMM/384/169/2304/1/0       2475 us       2412 us        327 K=2.304k M=384 N=169 alpha=1 beta=0
SGEMM/50/1000/1/1/1            10 us         10 us      73312 K=1 M=50 N=1000 alpha=1 beta=1
SGEMM/50/1000/4096/1/0       6364 us       5803 us        100 K=4.096k M=50 N=1000 alpha=1 beta=0
SGEMM/50/4096/1/1/1            46 us         45 us      13491 K=1 M=50 N=4.096k alpha=1 beta=1
SGEMM/50/4096/4096/1/0      29223 us      26913 us         20 K=4.096k M=50 N=4.096k alpha=1 beta=0
SGEMM/50/4096/9216/1/0      55410 us      55181 us         10 K=9.216k M=50 N=4.096k alpha=1 beta=0
SGEMM/96/3025/1/1/1            55 us         51 us      14408 K=1 M=96 N=3.025k alpha=1 beta=1
SGEMM/96/3025/363/1/0        1313 us       1295 us        570 K=363 M=96 N=3.025k alpha=1 beta=0

Output as JSON using

./scope --benchmark_out_format=json --benchmark_out=test.json

or preferably

./scope --benchmark_out_format=json --benchmark_out=`hostname`.json

Repeat benchmark runs with

./scope --benchmark_repetitions=5

Plot Benchmark JSON files

Try the ScopePlot python package.

pip install scope_plot

On Minsky With PowerAI

cd build && rm -fr * && OpenBLAS=/opt/DL/openblas cmake -DCMAKE_BUILD_TYPE=Release .. -DOpenBLAS=/opt/DL/openblas

Disable CPU frequency scaling

If you see this warning:

***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.

you might want to disable CPU frequency scaling while running the benchmark. On Ubuntu, install the cpupower tools:

sudo apt install linux-tools-$(uname -r)

then

sudo cpupower frequency-set --governor performance
./scope
sudo cpupower frequency-set --governor powersave
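
The sequence above assumes the machine was using the powersave governor beforehand. A hedged variant that remembers the previous governor and restores it is sketched below. with_performance_governor, current_governor, and GOV_FILE are hypothetical names, and the sketch assumes the Linux cpufreq sysfs interface is available for CPU 0. It does not handle the run being interrupted.

```shell
#!/bin/sh
# Where Linux exposes the active governor of CPU 0 (assumes cpufreq sysfs).
GOV_FILE=${GOV_FILE:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor}

# current_governor: print the active governor, falling back to "powersave"
# (the value restored in the README) when the sysfs file is not readable.
current_governor() {
    if [ -r "$GOV_FILE" ]; then cat "$GOV_FILE"; else echo powersave; fi
}

# with_performance_governor CMD...: run CMD under the performance governor,
# then restore the previous governor and return CMD's exit status.
with_performance_governor() {
    previous=$(current_governor)
    sudo cpupower frequency-set --governor performance
    "$@"
    status=$?
    sudo cpupower frequency-set --governor "$previous"
    return "$status"
}

# Usage (not executed here):
#   with_performance_governor ./scope --benchmark_filter=SGEMM
```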

Run with Docker

Install nvidia-docker, then list the available benchmarks:

nvidia-docker run --rm raiproject/microbench:amd64-latest bench --benchmark_list_tests

You can run benchmarks in the following way (probably with the --benchmark_filter flag).

nvidia-docker run --privileged --rm -v `readlink -f .`:/data -u `id -u`:`id -g` raiproject/microbench:amd64-latest ./numa-separate-process.sh dgx bench /data/sync2
  • --privileged is needed to set the NUMA policy if NUMA benchmarks are to be run.
  • -v `readlink -f .`:/data maps the current directory into the container as /data.
  • --benchmark_out=/data/`hostname`.json tells the bench binary to write the JSON output files to /data in the container, which is mapped to the current directory.
  • -u `id -u`:`id -g` tells docker to run as user id -u and group id -g, which is the current user and group. This means that files that docker produces will be modifiable from the host system without root permission.

Hunter Toolchain File

If some of the third-party code compiled by Hunter needs a different compiler, you can create a CMake toolchain file that sets the CMake variables to be used globally when building that code. You can then pass this file to CMake:

cmake -DCMAKE_TOOLCHAIN_FILE=toolchain.cmake ...

Adding a new benchmark

If you would like to develop a benchmark suite, read here for more information. Also, check out the Example|Scope for a template to get started.

Third-Party Resources

Contributors

abduld, cli99, cwpearson, hashash2, jinjunxiong

scope's Issues

How to print the version when a submodule is detached

There is no refspec when the module is detached, so our current version logic prints unknown.

  1. When we are detached from a branch, we could print detached-<hash>
  2. When we are detached we could print the project version
  3. we could always print the project version, followed by a commit
  4. ...
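
Option 1 can be prototyped from the command line before it is ported into the build logic. The following is only a sketch of the idea using standard git commands. describe_ref is a hypothetical name, and it is not the project's current version logic.

```shell
#!/bin/sh
# describe_ref DIR: print the branch name of the checkout in DIR, or
# "detached-<short hash>" when HEAD is detached, or "unknown" when DIR is
# not a usable git checkout. Hypothetical helper illustrating option 1.
describe_ref() {
    if branch=$(git -C "$1" symbolic-ref --quiet --short HEAD 2>/dev/null); then
        echo "$branch"
    elif commit=$(git -C "$1" rev-parse --short HEAD 2>/dev/null); then
        echo "detached-$commit"
    else
        echo unknown
    fi
}

# Usage (not executed here): describe_ref path/to/a/submodule
```

git symbolic-ref fails when HEAD is detached, which is what makes the fallback to the commit hash possible.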

Provide a CMake function that adds the right includes for scopes

Currently, each scope's CMakeLists.txt needs to have something like

target_include_directories(example_scope PRIVATE 
    ${SCOPE_SRC_DIR}
    ${THIRDPARTY_DIR}
    ${TOP_DIR}/include
    ${CUDA_INCLUDE_DIRS}
    ${PROJECT_BINARY_DIR}/src
    "src"
)

It would be nicer if scopes only needed to do

include(TargetScopeIncludeDirectories)
target_scope_include_directories(example_scope)

Move ENABLE_<SCOPE> CMake options to submodules

To facilitate people adding scopes, we can make the following changes:

  1. Automatically treat any folder in scopes as a scope.
  2. Define the ENABLE_ option in the submodule

We'll also need each submodule to set a variable in the parent scope that has the target that SCOPE will need to link against.

# example_scope CMakeLists.txt

option(ENABLE_EXAMPLE "Include Example|Scope (github.com/c3sr/example_scope)" ON)
if (ENABLE_EXAMPLE)
  scope_status("Enabling Example|Scope")
  add_subdirectory(${SCOPE_SCOPES_DIR}/example_scope EXCLUDE_FROM_ALL)
  target_link_libraries(scope example_scope)
  set(SCOPE_NEW_TARGET example_scope PARENT_SCOPE)
endif(ENABLE_EXAMPLE)

Then SCOPE can loop through all the dirs in scopes and also do target_link_libraries(scope ${SCOPE_NEW_TARGET})

SCOPE can provide a scope_add_library function that would just wrap add_library but also set the SCOPE_NEW_TARGET variable in the parent scope.

Provide easy versioning for scopes

It would be nice if there was infrastructure provided for scopes to easily add versioning.

This could be some combination of CMake functions that get git tags/branches or read a VERSION file (if git is not used), plus a simple template for parsing the --version flag on the command line.

Avoid passing spurious arguments to benchmark

When a scope defines a new argument, we'd like to avoid passing that to benchmark so that we can allow benchmark to report unrecognized arguments, but we'd also like to disallow scopes from arbitrarily modifying command line arguments in case they have some kind of bug.

Improve CMake installation instructions

The current CMake installation suggestions will override any system installed CMake. This is not so good for inexperienced users. We can add two paths:

  1. suggest uninstalling the system CMake and then installing CMake 3.12 in a system directory
  2. suggest how to do a per-user install of the new CMake.

remove unneeded third-party projects

  • remove submodules
  • remove cub
  • remove GitUtils
  • remove cublas
  • remove third_party/cafe_proto
  • remove third_party/catch
  • remove third_party/clara
  • remove third_party/enum
  • remove third_party/range
  • remove third_party/safeenum
  • remove third_party/stb
  • remove OpenBLAS

Any ETA for tensorcore/scope availability?

I just read “Accelerating Reduction and Scan Using Tensor Core Units” and assume the code is part of the scope tensorcore package, right? Also, do you plan to offer patches somewhere for your optimized CUB as used in the paper:
“CUB does not contain these shuffle-based optimizations for half precision. To make the evaluations fair and more technically meaningful, we implement these optimization for the half precision data type in CUB. The modified CUB is used for the evaluation to provide a more aggressive base of comparison”
thanks..

Compilation error with XLClang 16.1.1-3

When compiling, we get

[100%] Linking CXX executable scope
/ccs/home/merth/.hunter/_Base/51d2d6b/8e06572/d68db6d/Install/lib64/libfmtd.a(format.cc.o): In function `fmt::internal::FormatterBase::next_arg(char const*&)':
/ccs/home/merth/.hunter/_Base/51d2d6b/8e06572/d68db6d/Build/fmt/Source/fmt/format.h:2214: undefined reference to `_fill'
/ccs/home/merth/.hunter/_Base/51d2d6b/8e06572/d68db6d/Install/lib64/libfmtd.a(format.cc.o): In function `fmt::internal::FormatterBase::get_arg(unsigned int, char const*&)':
/ccs/home/merth/.hunter/_Base/51d2d6b/8e06572/d68db6d/Build/fmt/Source/fmt/format.h:2220: undefined reference to `_fill'
/ccs/home/merth/.hunter/_Base/51d2d6b/8e06572/d68db6d/Install/lib64/libfmtd.a(format.cc.o): In function `fmt::BasicFormatter<char, fmt::ArgFormatter<char> >::get_arg(fmt::BasicStringRef<char>, char const*&)':
/ccs/home/merth/.hunter/_Base/51d2d6b/8e06572/d68db6d/Build/fmt/Source/fmt/format.h:3809: undefined reference to `_fill'
/ccs/home/merth/.hunter/_Base/51d2d6b/8e06572/d68db6d/Install/lib64/libfmtd.a(format.cc.o):(.eh_frame+0x13): undefined reference to `__IBMCPlusPlusExceptionV3'
/usr/bin/ld: link errors found, deleting executable `scope'
collect2: error: ld returned 1 exit sta

This is fixed with module load gcc followed by cmake .. -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ -DCMAKE_CUDA_HOST_COMPILER=`which gcc`
