cuda_hook's Introduction

CUDA Hook

CUDA Hook hooks CUDA-related dynamic libraries by means of automated code generation tools. With the hooks in place, you can easily see which CUDA APIs a CUDA program calls, and you can also hijack a CUDA API to insert custom logic.

It includes a tool that automatically generates the hooking code for the CUDA API from the CUDA native header files, which makes the project practical and extensible.

Hooking is currently implemented for the following dynamic libraries: CUDA driver, NVML, CUDA runtime, cuDNN, cuBLAS, cuBLASLt, cuFFT, NVTX, NVRTC, cuRAND, cuSPARSE, cuSOLVER, nvJPEG and NVBLAS. The same approach extends easily to other CUDA dynamic libraries.
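The forwarding pattern behind such a hook can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's generated code: it forwards to `cos` from `libm.so.6` as a stand-in for a CUDA API so that it runs without a GPU, and the names `GetRealSymbol`, `HookedCos` and `g_cos_calls` are hypothetical.

```cpp
#include <dlfcn.h>

#include <cassert>
#include <cstdio>

// Hypothetical helper: open the real library and look up one symbol in it.
// Returns nullptr (after printing dlerror) if the library cannot be opened.
static void *GetRealSymbol(const char *lib_path, const char *symbol) {
    void *handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return nullptr;
    }
    return dlsym(handle, symbol);
}

// Stand-in for "custom logic": count how often the hooked API is called.
static int g_cos_calls = 0;

// Wrapper with the same signature as the real function. A real hook would
// export it under the original API name from a replacement library.
double HookedCos(double x) {
    using func_ptr = double (*)(double);
    // Resolved once on first call, then reused.
    static auto func_entry =
        reinterpret_cast<func_ptr>(GetRealSymbol("libm.so.6", "cos"));
    assert(func_entry != nullptr);
    ++g_cos_calls;          // custom logic inserted before forwarding
    return func_entry(x);   // forward to the real implementation
}
```

Calling `HookedCos(0.0)` forwards to the real `cos` and increments the counter. On older glibc versions the program may need to be linked with `-ldl`.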

Supported Dynamic Libraries

  • CUDA Driver: libcuda.so
  • NVML: libnvidia-ml.so
  • CUDA Runtime: libcudart.so
  • CUDNN: libcudnn.so
  • CUBLAS: libcublas.so
  • CUBLASLT: libcublasLt.so
  • CUFFT: libcufft.so
  • NVTX: libnvToolsExt.so
  • NVRTC: libnvrtc.so
  • CURAND: libcurand.so
  • CUSPARSE: libcusparse.so
  • CUSOLVER: libcusolver.so
  • NVJPEG: libnvjpeg.so
  • NVBLAS: libnvblas.so

Compile

Environment

  • OS: Linux
  • CMake Version: >= 3.12
  • GCC Version: >= 4.8
  • CUDA Version: 11.4 (best)
  • CUDA Driver Version: 470.129.06 (best)
  • CUDNN Version: 7.6.5 (best)

Clone

git clone https://github.com/Bruce-Lee-LY/cuda_hook.git

Build

GTX1080Ti

cd cuda_hook
./build.sh -a 61 -t Release -s ON -b OFF
./build.sh -a 61 -t Debug -s OFF -b ON

Tesla V100

cd cuda_hook
./build.sh -a 70 -t Release -s ON -b OFF
./build.sh -a 70 -t Debug -s OFF -b ON

RTX2080Ti

cd cuda_hook
./build.sh -a 75 -t Release -s ON -b OFF
./build.sh -a 75 -t Debug -s OFF -b ON

NVIDIA A100

cd cuda_hook
./build.sh -a 80 -t Release -s ON -b OFF
./build.sh -a 80 -t Debug -s OFF -b ON

RTX3080Ti / RTX3090 / RTX A6000

cd cuda_hook
./build.sh -a 86 -t Release -s ON -b OFF
./build.sh -a 86 -t Debug -s OFF -b ON

Run Sample

./run_sample.sh

Tools

Code Generate

Uses the CUDA native header files to automatically generate the code that hooks the CUDA API.

cd tools/code_generate
./code_generate.sh

cuda_hook's People

Contributors

chenzhuofu

cuda_hook's Issues

BUG: `dlopen("/usr/local/cuda/targets/x86_64-linux/lib/libcublas.so", RTLD_NOW | RTLD_LOCAL)` failed, symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference

When running a TensorFlow MNIST training task, the error 'Check failed: cublas_handle' occurs.

It is caused by dlopen; the complete call is dlopen("/usr/local/cuda/targets/x86_64-linux/lib/libcublas.so", RTLD_NOW | RTLD_LOCAL). The error reported by dlopen is 'Failed to open /usr/local/cuda/targets/x86_64-linux/lib/libcublas.so: /usr/local/cuda/targets/x86_64-linux/lib/libcublas.so: symbol free_gemm_select version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference'.

According to ldd and nm, libcublas.so depends on libcublasLt.so.11, which resolves to '/home/chenqian/Code/cuda_hook/output/lib64/libcublasLt.so.11'. However, the symbol free_gemm_select is present in neither '/home/chenqian/Code/cuda_hook/output/lib64/libcublasLt.so.11' nor '/usr/local/cuda/targets/x86_64-linux/lib/libcublasLt.so.11'.

Without cuda_hook, the training task completes.
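When debugging this kind of problem, it can help to confirm from inside the process which file actually provides a symbol. The sketch below does that with `dlopen`, `dlsym` and `dladdr`. It is an illustration only: `ProviderOf` is a hypothetical helper, not part of cuda_hook, and `dladdr` is a glibc extension.

```cpp
#include <dlfcn.h>

#include <string>

// Return the path of the shared object that provides `symbol` when it is
// looked up through `lib_name`. On failure, return a string that starts
// with "error: " and carries the dlerror() text where available.
std::string ProviderOf(const char *lib_name, const char *symbol) {
    void *handle = dlopen(lib_name, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char *err = dlerror();
        return std::string("error: ") + (err ? err : "unknown dlopen failure");
    }
    void *addr = dlsym(handle, symbol);
    if (addr == nullptr) {
        return std::string("error: symbol not found: ") + symbol;
    }
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr) {
        return "error: dladdr could not resolve the address";
    }
    return info.dli_fname;  // path of the object that contains the symbol
}
```

For example, calling `ProviderOf("libcublasLt.so.11", "cublasLtCreate")` in the failing environment would show whether the hook copy under output/lib64 or the copy under /usr/local/cuda is the one being loaded.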

Chat

Hello, I have recently been learning about CUDA interception. Could you share your contact information so that we can discuss it more conveniently?

It doesn't work with the newer driver and CUDA

It seems that it works with CUDA 11.6 and fails to work with CUDA 11.8 with Driver Version 525.85.12. Could you please fix it?

The following is the error message.
$ more log/sample/nvml/nvml_example.log
[CUDA-HOOK 71169:71169 hook.h:33 GetNVMLSymbol] check failed: nvml_handle

$ more matrix_mul.log
/home/ps/cuda_hook_retry/output/sample/cuda/matrix_mul: /home/ps/cuda_hook_retry/output/lib64/libcudart.so.11.0: no version information available (required by /home/ps/cuda_hook_retry/output/sample/cuda/matrix_mul)
[CUDA-HOOK 71167:71167 hook.h:29 GetCUDASymbol] check failed: cuda_handle
[Matrix Multiply Using CUDA] - Starting...

no version information available

There is a version issue.
If I run nvidia-smi in the cuda_hook/output/lib64 directory, it returns the following error.
"Mismatch in versions between nvidia-smi and NVML.
Are you sure you are using nvidia-smi provided with the driver?"

In addition, the bandwidth_test shows the following error: cuda_hook/output/lib64/libcudart.so.11.0: no version information available (required by bandwidth_test)

[Bug] code_generate mis-parses parameters of *array type* such as "int x[]"

Description

I found that code_generate has a bug when parsing parameters of array type such as "int x[]".

Reproduction

You can reproduce the bug with the header file below:

void test_with_brackets_1(int x[]);

void test_with_brackets_2(int x[][5]);

The output of code_generate.py is shown below. Note that the array brackets are dropped and both parameters become plain int:

// auto generate 2 apis

#include "test_subset.h"
#include "hook.h"
#include "macro_common.h"
#include "trace_profile.h"

HOOK_C_API HOOK_DECL_EXPORT void test_with_brackets_1(int x) {
    HOOK_TRACE_PROFILE("test_with_brackets_1");
    using func_ptr = void (*)(int);
    static auto func_entry = reinterpret_cast<func_ptr>(HOOK_TEST_SYMBOL("test_with_brackets_1"));
    HOOK_CHECK(func_entry);
    return func_entry(x);
}

HOOK_C_API HOOK_DECL_EXPORT void test_with_brackets_2(int x) {
    HOOK_TRACE_PROFILE("test_with_brackets_2");
    using func_ptr = void (*)(int);
    static auto func_entry = reinterpret_cast<func_ptr>(HOOK_TEST_SYMBOL("test_with_brackets_2"));
    HOOK_CHECK(func_entry);
    return func_entry(x);
}
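One possible way to handle such parameters is to take the identifier just before the first `[` as the name and keep everything else, brackets included, as the type. `int x[][5]` then yields the type `int [][5]`, which is valid inside a function pointer type such as `void (*)(int [][5])`. The sketch below shows the idea in C++ for illustration only. The real generator is code_generate.py, whose internals are not shown in this report, and `SplitParam` is a hypothetical name. It assumes named parameters and does not cover function-pointer parameters.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Split one C parameter declaration into {type without the name, name}.
std::pair<std::string, std::string> SplitParam(const std::string &decl) {
    auto is_space = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    };
    auto is_ident = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
    };

    // The name ends at the first '[' or, without brackets, at the end.
    std::size_t end = decl.find('[');
    if (end == std::string::npos) end = decl.size();
    while (end > 0 && is_space(decl[end - 1])) --end;

    // Walk back over identifier characters to find where the name starts.
    std::size_t begin = end;
    while (begin > 0 && is_ident(decl[begin - 1])) --begin;

    std::string name = decl.substr(begin, end - begin);
    // Type = text before the name plus any array brackets after it.
    std::string type = decl.substr(0, begin) + decl.substr(end);
    while (!type.empty() && is_space(type.back())) type.pop_back();
    return {type, name};
}
```

With this split, the wrapper signature can keep the original declaration (`int x[][5]`), the function pointer type uses the returned type (`int [][5]`), and the forwarding call uses the returned name (`x`).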

Why was cuGetProcAddress manually deleted?

Recently, I have been studying this project and noticed that you manually removed the interception of cuGetProcAddress. May I ask why?

// manually delete
// HOOK_C_API HOOK_DECL_EXPORT CUresult cuGetProcAddress(const char *symbol, void **pfn, int cudaVersion,
// cuuint64_t flags) {
// HOOK_TRACE_PROFILE("cuGetProcAddress");
// using func_ptr = CUresult (*)(const char *, void **, int, cuuint64_t);
// static auto func_entry = reinterpret_cast<func_ptr>(HOOK_CUDA_SYMBOL("cuGetProcAddress"));
// HOOK_CHECK(func_entry);
// return func_entry(symbol, pfn, cudaVersion, flags);
// }

Porting to CUDA 12

Hi,

After running code generation for CUDA 12.3 and then build.sh, the matrix_mul sample fails with:

CUDA error at /opt/cuda_hook/sample/cuda/include/helper_cuda.h:739 code=36(cudaErrorCallRequiresNewerDriver) "cudaGetDeviceCount(&device_count)"

[root@xxxxbuild]# strings ./* | grep -i versi |grep 12
strings: Warning: './CMakeFiles' is a directory
CUDA_VERSION:STRING=12.3

Is this problem related to cuda_subset.h?
