Comments (15)

chaoming0625 commented on June 15, 2024

Thank you @kmaehashi @leofang. Currently, I am using the pointer from RawKernel.kernel.ptr, just as @leofang pointed out. However, I also agree that @kmaehashi's suggestion is right.

The motivation for my question is to use CuPy as a compiler for custom CUDA extensions in JAX. JAX's jit system needs to register an XLA custom call when using customized CUDA kernels. Usually, we need to write the CUDA code, pre-compile it, bind it to Python, and register the kernels in XLA. To remove such a complex process, we can directly compile the source code (written as a Python string) at the Python level, get the compiled kernel, throw it into the custom call, and everything remains compatible with JAX's jit system, with minimal effort (only the Python string needs to be written).
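
For reference, a minimal sketch of the CuPy side of this workflow might look like the code below. The kernel source and its name are made up for illustration, and the JAX/XLA registration step is omitted; as discussed later in this thread, .kernel.ptr is not a public API.

    import cupy as cp

    # A hypothetical kernel, written as a plain Python string.
    source = r'''
    extern "C" __global__ void scale(const float* x, float* y, float a, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            y[i] = a * x[i];
        }
    }
    '''

    kern = cp.RawKernel(source, 'scale')  # NVRTC compiles the string lazily on first use
    ptr = kern.kernel.ptr                 # raw CUfunction handle (non-public API)
    # `ptr` is the value that would be handed to the XLA custom-call machinery;
    # that JAX-side step is not shown here.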

Currently, we are working on this functionality.

kmaehashi commented on June 15, 2024

Are there any specific reasons to use CuPy for that purpose? If your C++ application needs to compile CUDA code on the fly, you can just call NVRTC to get cubin/ptx.

takagi commented on June 15, 2024

You cannot launch a kernel defined by cupy.RawKernel; however, an option may be cupy.RawModule, which can be used to load a .cubin or .ptx file. Does it fit?
https://docs.cupy.dev/en/stable/reference/generated/cupy.RawModule.html
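
For example, a minimal sketch of loading a pre-built module might look like this (the file name and kernel name are hypothetical):

    import cupy as cp

    # Load a module built ahead of time; a .ptx file works the same way.
    mod = cp.RawModule(path='kernels.cubin')
    func = mod.get_function('my_kernel')  # returns a launchable kernel object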

chaoming0625 commented on June 15, 2024

What a great answer! So the key is to use RawModule to generate a .cubin or .ptx file, and then load the generated file under the C++ backend to run it. Am I right?

chaoming0625 commented on June 15, 2024

Moreover, can kernels compiled by cupyx.jit.rawkernel be saved into a .ptx file?

takagi commented on June 15, 2024

What I meant was the opposite: you write the kernel in C++ (a .cu file) and compile it into .cubin or .ptx files, and then you can use them from RawModule. cupyx.jit.rawkernel doesn't have a feature to save a .ptx file in a way that is easily usable from external programs.
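
A sketch of that workflow, assuming a file kernels.cu that defines an extern "C" kernel add_one and a GPU of compute capability 8.0 (file names, kernel name, and architecture are examples):

    # Build the binary ahead of time, outside Python, e.g.:
    #   nvcc -cubin -arch=sm_80 kernels.cu -o kernels.cubin
    import cupy as cp

    mod = cp.RawModule(path='kernels.cubin')
    add_one = mod.get_function('add_one')  # the kernel must be declared extern "C"

    x = cp.zeros(1024, dtype=cp.float32)
    add_one((4,), (256,), (x, cp.int32(x.size)))  # grid, block, kernel arguments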

chaoming0625 commented on June 15, 2024

Thanks for the explanation. I am wondering how to get the compiled binary when using cupy.RawKernel?

takagi commented on June 15, 2024

Please supply the path to the compiled binary via the path argument of RawModule (not RawKernel). https://docs.cupy.dev/en/stable/reference/generated/cupy.RawModule.html

chaoming0625 commented on June 15, 2024

I am trying to use CuPy to compile the CUDA code and get its compiled kernel, rather than providing the path of a compiled CUDA binary (*.cubin) or a PTX file. So I am wondering: how can I provide the CUDA source code and then get the path of the binary file compiled by CuPy?

chaoming0625 commented on June 15, 2024

Or, how can I get the kernel under the $HOME/.cupy/kernel_cache/ directory? The names of the .cubin files seem to follow no pattern.

leofang commented on June 15, 2024

In theory, for a given RawKernel (whether you use it directly or get it via RawModule.get_function), you can retrieve the CUfunction pointer via RawKernel.kernel.ptr, but

  1. This is not public API
  2. This is untested

It's also unclear to me why you'd need this; @chaoming0625, could you elaborate?
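
For reference, with the above caveats in mind, a sketch of that access pattern might look like this (the trivial kernel is only for illustration):

    import cupy as cp

    mod = cp.RawModule(code=r'''
    extern "C" __global__ void noop() { }
    ''')
    k = mod.get_function('noop')  # a RawKernel
    ptr = k.kernel.ptr            # CUfunction handle; non-public and untested, as noted above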

chaoming0625 commented on June 15, 2024

Moreover, can I get the pointer of the function after compiling through cupyx.jit.rawkernel?

leofang commented on June 15, 2024

Thanks for sharing your use case @chaoming0625, this is very interesting!

Would you be able to show us how you use this capability to make CuPy and JAX interoperable at the kernel level? I would love to see how it allows you to avoid writing complex boilerplate code. Eventually, I would like to learn how to craft a small interop demo like the one we showed for PyTorch-CuPy:
https://docs.cupy.dev/en/stable/user_guide/interoperability.html#using-custom-kernels-in-pytorch
If you already have a small demo that we can copy/paste into the document, that's even better! 😄

Moreover, can I get the pointer of the function after compiling through cupyx.jit.rawkernel?

Right now it's not public API either, but according to the internal implementation (subject to change)

kern, enable_cg = self._cache.get((in_types, device_id), (None, None))

it is possible to get the Function object from jit.rawkernel._cache once the kernel has been instantiated (it is stored as part of the cached value). Then, you can get the CUfunction pointer via Function.ptr as before.

If you show us your workflow as I ask above, it'll help us stabilize the interface and expose these features properly. Thanks!
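
Putting that together, a sketch of this (non-public, subject-to-change) path might look like the following; the kernel is essentially the elementwise copy example from the CuPy documentation:

    import cupy
    from cupyx import jit

    @jit.rawkernel()
    def elementwise_copy(x, y, size):
        tid = jit.blockIdx.x * jit.blockDim.x + jit.threadIdx.x
        ntid = jit.gridDim.x * jit.blockDim.x
        for i in range(tid, size, ntid):
            y[i] = x[i]

    size = cupy.uint32(2 ** 20)
    x = cupy.random.normal(size=(size,), dtype=cupy.float32)
    y = cupy.empty((size,), dtype=cupy.float32)
    elementwise_copy((128,), (1024,), (x, y, size))  # the first call compiles and fills _cache

    # Internal detail, subject to change: each cached value is a (kern, enable_cg) tuple.
    kern, _ = next(iter(elementwise_copy._cache.values()))
    ptr = kern.ptr  # CUfunction pointer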
