rocPRIM

rocPRIM is a header-only library providing HIP parallel primitives for developing performant GPU-accelerated code on the AMD ROCm platform.

Requirements

  • Git
  • CMake (3.5.1 or later)
  • AMD ROCm platform (1.8.2 or later)

Optional:

  • GTest
    • Required only for tests. Building tests is enabled by default.
    • It is downloaded and built automatically by the CMake script.
  • Google Benchmark
    • Required only for benchmarks. Building benchmarks is off by default.
    • It is downloaded and built automatically by the CMake script.

Build and Install

git clone https://github.com/ROCmSoftwarePlatform/rocPRIM.git

# Go to rocPRIM directory, create and go to the build directory.
cd rocPRIM; mkdir build; cd build

# Configure rocPRIM, setup options for your system.
# Build options:
#   BUILD_TEST - on by default.
#   BUILD_BENCHMARK - off by default.
#   BENCHMARK_CONFIG_TUNING - off by default. This flag enables a search for the best kernel config parameters.
#     When it is ON, compilation time can increase significantly.
#   AMDGPU_TARGETS - list of AMD architectures, default: gfx803;gfx900;gfx906.
#     You can make compilation faster if you want to test/benchmark only on one architecture,
#     for example, add -DAMDGPU_TARGETS=gfx906 to 'cmake' parameters.
#
# ! IMPORTANT !
# Set the C++ compiler to HCC or HIP-clang, either by adding 'CXX=<path-to-compiler>'
# before 'cmake' or by setting the CMake option 'CMAKE_CXX_COMPILER' to the compiler path.
# Using HCC:
[CXX=hcc] cmake -DBUILD_BENCHMARK=ON ../. # or cmake-gui ../.
# or using HIP-clang:
[CXX=hipcc] cmake -DBUILD_BENCHMARK=ON ../.

# Build
make -j4

# Optionally, run tests if they're enabled.
ctest --output-on-failure

# Install
[sudo] make install

Using rocPRIM

Include the <rocprim/rocprim.hpp> header:

#include <rocprim/rocprim.hpp>

The recommended way to include rocPRIM in a CMake project is through its package configuration files. The rocPRIM package name is rocprim.

# "/opt/rocm" - default install prefix
find_package(rocprim REQUIRED CONFIG PATHS "/opt/rocm/rocprim")

...

# Includes only rocPRIM headers; HIP libraries have
# to be linked manually by the user
target_link_libraries(<your_target> roc::rocprim)

# Includes rocPRIM headers and required HIP dependencies
target_link_libraries(<your_target> roc::rocprim_hip)

Running Unit Tests

# Go to rocPRIM build directory
cd rocPRIM; cd build

# To run all tests
ctest

# To run unit tests for rocPRIM
./test/rocprim/<unit-test-name>

Using custom seeds for the tests

Go to the rocPRIM/test/rocprim/test_seed.hpp file.

//(1)
static constexpr int random_seeds_count = 10;

//(2)
static constexpr unsigned int seeds [] = {0, 2, 10, 1000}; 

//(3)
static constexpr size_t seed_size = sizeof(seeds) / sizeof(seeds[0]);

(1) defines a constant that sets how many passes over the tests will be done with runtime-generated seeds. Modify at will.

(2) defines the user-provided seeds. Each element of the array is used as a seed for all tests. Modify at will. If no static seeds are desired, the array should be left empty.

static constexpr unsigned int seeds [] = {}; 

(3) this line should never be modified.
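The interplay of the three definitions above can be sketched in plain host C++. This is only an illustration under assumptions: `demo_random_seeds_count`, `demo_seeds`, and `collect_seeds` are hypothetical names, not part of test_seed.hpp, and the order in which the real tests consume runtime-generated and static seeds is assumed here, not taken from the rocPRIM sources.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-ins for the constants in test_seed.hpp.
constexpr int demo_random_seeds_count = 3;
constexpr std::array<unsigned int, 4> demo_seeds = {{0, 2, 10, 1000}};

// Builds the list of seeds for one test run (assumed order):
// first `random_count` seeds drawn at run time, then every static seed.
template <std::size_t N>
std::vector<unsigned int> collect_seeds(int random_count,
                                        const std::array<unsigned int, N>& static_seeds)
{
    assert(random_count >= 0);
    std::vector<unsigned int> result;
    result.reserve(static_cast<std::size_t>(random_count) + N);

    std::random_device rd; // source of the runtime-generated seeds
    for (int i = 0; i < random_count; ++i)
        result.push_back(rd());

    // Static seeds are appended unchanged, so those passes are reproducible.
    result.insert(result.end(), static_seeds.begin(), static_seeds.end());
    return result;
}
```

With the values above, `collect_seeds(demo_random_seeds_count, demo_seeds)` yields seven seeds: three that differ from run to run, followed by 0, 2, 10 and 1000. Passing an empty `std::array<unsigned int, 0>` leaves only the runtime-generated ones.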

Running Benchmarks

# Go to rocPRIM build directory
cd rocPRIM; cd build

# To run benchmark for warp functions:
# Further options can be found using --help
# [] Fields are optional
./benchmark/benchmark_warp_<function_name> [--size <size>] [--trials <trials>]

# To run benchmark for block functions:
# Further options can be found using --help
# [] Fields are optional
./benchmark/benchmark_block_<function_name> [--size <size>] [--trials <trials>]

# To run benchmark for device functions:
# Further options can be found using --help
# [] Fields are optional
./benchmark/benchmark_device_<function_name> [--size <size>] [--trials <trials>]

Performance configuration

Most device-wide primitives provided by rocPRIM can be tuned for different AMD devices, types, or operations using compile-time configuration structures passed to them as a template parameter. The main "knobs" are usually the block size and the number of items processed by a single thread.
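The idea of passing a compile-time configuration as a template parameter can be sketched on the host, without a GPU. The names `demo_config`, `demo_block_count`, and `demo_sum` are hypothetical and are not rocPRIM API; the sketch only shows how the two knobs (block size and items per thread) fix the tiling of the input at compile time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical configuration structure (not a rocPRIM type).
template <unsigned int BlockSize, unsigned int ItemsPerThread>
struct demo_config
{
    static_assert(BlockSize > 0 && ItemsPerThread > 0, "both knobs must be positive");
    static constexpr unsigned int block_size       = BlockSize;
    static constexpr unsigned int items_per_thread = ItemsPerThread;
    // Number of items handled by one block.
    static constexpr unsigned int items_per_block  = BlockSize * ItemsPerThread;
};

// Number of blocks needed to cover `size` items (rounded up).
template <class Config>
std::size_t demo_block_count(std::size_t size)
{
    const std::size_t tile = Config::items_per_block;
    return (size + tile - 1) / tile;
}

// Host-side stand-in for a device-wide sum: each loop iteration plays
// the role of one block and reduces one tile of the input.
template <class Config>
long long demo_sum(const std::vector<int>& input)
{
    const std::size_t tile = Config::items_per_block;
    assert(tile > 0);
    long long total = 0;
    for (std::size_t start = 0; start < input.size(); start += tile)
    {
        const std::size_t end = std::min(input.size(), start + tile);
        long long partial = 0; // result of one "block"
        for (std::size_t i = start; i < end; ++i)
            partial += input[i];
        total += partial;
    }
    return total;
}
```

For 1000 input items, `demo_block_count<demo_config<64, 2>>` gives 8 blocks and `demo_block_count<demo_config<256, 4>>` gives 1, while `demo_sum` returns the same total for both. The configuration changes how the work is split, not the result.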

rocPRIM has built-in default configurations for each of its primitives. To use them, define the macro ROCPRIM_TARGET_ARCH to 803 if algorithms should be optimized for the gfx803 GCN version, or to 900 for gfx900.
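One common way to implement macro-selected defaults is template specialization keyed on the architecture number. The sketch below illustrates that general technique only; it is not taken from the rocPRIM sources. `DEMO_TARGET_ARCH` is a hypothetical stand-in for ROCPRIM_TARGET_ARCH, `demo_params` and `demo_default_config` are invented names, and the numeric values are made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for ROCPRIM_TARGET_ARCH; normally set on the
// compiler command line, e.g. -DDEMO_TARGET_ARCH=803.
#ifndef DEMO_TARGET_ARCH
#define DEMO_TARGET_ARCH 900
#endif

// The two usual tuning knobs.
struct demo_params
{
    unsigned int block_size;
    unsigned int items_per_thread;
};

// Primary template: generic fallback for unknown architectures.
template <unsigned int Arch>
struct demo_default_config
{
    static constexpr demo_params value() { return {256, 4}; }
};

// Specialization for gfx803 (values are illustrative, not measured).
template <>
struct demo_default_config<803>
{
    static constexpr demo_params value() { return {256, 8}; }
};

// Specialization for gfx900 (values are illustrative, not measured).
template <>
struct demo_default_config<900>
{
    static constexpr demo_params value() { return {256, 16}; }
};

// The macro picks one specialization at compile time.
constexpr demo_params demo_selected()
{
    return demo_default_config<DEMO_TARGET_ARCH>::value();
}
```

Because the selection happens during compilation, a binary built this way carries the tuning for exactly one architecture value; an unknown value falls back to the primary template.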

Documentation

# go to rocPRIM doc directory
cd rocPRIM; cd doc

# run doxygen
doxygen Doxyfile

# open html/index.html

hipCUB

hipCUB is a thin wrapper library on top of rocPRIM or CUB. It enables developers to port projects that use the CUB library to the HIP layer and run them on AMD hardware. In a ROCm environment hipCUB uses rocPRIM as the backend; on CUDA platforms it uses CUB instead.

Support

Bugs and feature requests can be reported through the issue tracker.

Contributions and License

Contributions of any kind are most welcome! More details are found at CONTRIBUTING and LICENSE.
