Comments (7)

tonyrog commented on July 20, 2024

If I remember correctly, there is a bit of scheduling overhead involved with dirty NIFs.
What about a selective approach for the NIFs that are problematic? (Or are the majority of the allocations causing problems?)
Possibly using enif_system_info to read dirty_scheduler_support, plus a compile-time macro to check whether we can actually call enif_schedule_nif?

from cl.

arpieb commented on July 20, 2024

Yeah, there is some overhead, but it's measured in nanoseconds on modern hardware best I can tell. The "yielding NIF" approach is probably not going to be very tractable as most of the NIF functions map pretty much 1:1 to the OpenCL functions with the exception of unpacking terms.

I'll build some test cases where I can isolate and time the individual functions on thousands of runs on all three boxes and report back. There was a great presentation a couple years ago at ElixirConf US where they were performing timing studies on different NIF-handling approaches, maybe I can find the test scaffolding for that somewhere.

It's possible that it's only a subset of problem children, and also that it might be an OpenCL driver-vendor issue. After all, my MacPro is Mid-2010 vintage and I can't imagine that the bus between RAM and the GPU on it is that much faster than an i7-7800X with server-speed RAM and Nvidia Pascal GPUs...

tonyrog commented on July 20, 2024

I could wrap the nif table entries with something like:
//-------------------------------
#if (ERL_NIF_MAJOR_VERSION > 2) || ((ERL_NIF_MAJOR_VERSION == 2) && (ERL_NIF_MINOR_VERSION >= 12))
//#define NIF_FUNC(name,arity,fptr) {(name),(arity),(fptr),(ERL_NIF_DIRTY_JOB_CPU_BOUND)}
#define NIF_FUNC(name,arity,fptr) {(name),(arity),(fptr),(0)}
#elif (ERL_NIF_MAJOR_VERSION > 2) || ((ERL_NIF_MAJOR_VERSION == 2) && (ERL_NIF_MINOR_VERSION >= 7))
#define NIF_FUNC(name,arity,fptr) {(name),(arity),(fptr),(0)}
#else
#define NIF_FUNC(name,arity,fptr) {(name),(arity),(fptr)}
#endif
//-------------------------------

This way it would be fairly easy to switch to an all-dirty-NIF approach if it turns out
that the overhead is OK. Or at least allow a switch to dirty NIFs for anyone who wants
to compile with the -DUSE_DIRTY_SCHEDULER flag?
Perhaps even have a NIF_DIRTY_FUNC entry that is backward compatible?
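
A backward-compatible NIF_DIRTY_FUNC along those lines might look roughly like the sketch below. It is only an illustration: the stand-in ErlNifFunc struct and version numbers are there so the snippet compiles without erl_nif.h, the simplified function-pointer type is not the real NIF signature, and fast_call / slow_call are hypothetical names, not functions from cl.

```c
/* Stand-ins so this sketch compiles without <erl_nif.h>. In a real NIF these
 * come from the header; the definitions below are assumptions for illustration. */
#ifndef ERL_NIF_MAJOR_VERSION
#define ERL_NIF_MAJOR_VERSION 2
#define ERL_NIF_MINOR_VERSION 12
#define ERL_NIF_DIRTY_JOB_CPU_BOUND 1
typedef struct {
    const char *name;
    unsigned arity;
    int (*fptr)(void);      /* simplified; real NIFs take (env, argc, argv) */
    unsigned flags;
} ErlNifFunc;
#endif

/* True when the NIF API version is at least maj.min. */
#define NIF_VSN_AT_LEAST(maj, min) \
    ((ERL_NIF_MAJOR_VERSION > (maj)) || \
     ((ERL_NIF_MAJOR_VERSION == (maj)) && (ERL_NIF_MINOR_VERSION >= (min))))

/* Dirty flags are used only when the API is new enough AND the build opted in
 * with -DUSE_DIRTY_SCHEDULER; otherwise a "dirty" entry is a normal entry. */
#if NIF_VSN_AT_LEAST(2, 12) && defined(USE_DIRTY_SCHEDULER)
#define NIF_DIRTY_FLAGS ERL_NIF_DIRTY_JOB_CPU_BOUND
#else
#define NIF_DIRTY_FLAGS 0
#endif

#if NIF_VSN_AT_LEAST(2, 7)
/* Four-field table entries (the struct has a flags member). */
#define NIF_FUNC(name, arity, fptr)       {(name), (arity), (fptr), 0}
#define NIF_DIRTY_FUNC(name, arity, fptr) {(name), (arity), (fptr), NIF_DIRTY_FLAGS}
#else
/* Older three-field entries: no flags member, so both macros are identical. */
#define NIF_FUNC(name, arity, fptr)       {(name), (arity), (fptr)}
#define NIF_DIRTY_FUNC(name, arity, fptr) {(name), (arity), (fptr)}
#endif

/* Hypothetical NIF body used for both entries. */
static int noop_impl(void) { return 0; }

static ErlNifFunc nif_funcs[] = {
    NIF_FUNC("fast_call", 0, noop_impl),        /* always a normal NIF */
    NIF_DIRTY_FUNC("slow_call", 0, noop_impl),  /* dirty only if opted in */
};
```

With this shape, marking a single function as dirty is a one-word change in the table, and a build without the flag (or against an older erl_nif.h) falls back to plain entries.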

tonyrog commented on July 20, 2024

And just to clarify: my idea with enif_schedule_nif was not to break the NIF up into several pieces, but rather to decide dynamically when to run a NIF on a dirty scheduler. The idea is to have one entry point in the NIF (for cl:create_image/5, say ecl_create_image_dyn) where you can check the parameters and decide whether to call create_image/5 indirectly, using enif_schedule_nif with the ERL_NIF_DIRTY_JOB_CPU_BOUND flag, or just call ecl_create_image directly.
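
As a rough sketch of that dynamic decision: the parameter check can live in a small pure function, with the NIF glue around it shown only as a comment because it needs erl_nif.h. The size-based rule, the 1 MiB cutoff, and the names choose_run_mode and DIRTY_THRESHOLD_BYTES are assumptions made up for this example; the thread does not say which parameters would be checked.

```c
#include <stddef.h>

/* Hypothetical cutoff: images larger than this many bytes go to a dirty scheduler. */
#define DIRTY_THRESHOLD_BYTES ((size_t)1 << 20)

typedef enum { RUN_DIRECT = 0, RUN_DIRTY = 1 } run_mode_t;

/* Returns RUN_DIRTY when dirty schedulers are available and
 * width * height * bytes_per_pixel exceeds the threshold.
 * The comparison uses divisions so the product can never overflow. */
static run_mode_t choose_run_mode(int dirty_supported, size_t width,
                                  size_t height, size_t bytes_per_pixel)
{
    size_t max_pixels;

    if (!dirty_supported || width == 0 || height == 0 || bytes_per_pixel == 0)
        return RUN_DIRECT;

    max_pixels = DIRTY_THRESHOLD_BYTES / bytes_per_pixel;
    return (height > max_pixels / width) ? RUN_DIRTY : RUN_DIRECT;
}

/* In the NIF itself (requires <erl_nif.h>, so it is not compiled here):
 *
 *   static ERL_NIF_TERM ecl_create_image_dyn(ErlNifEnv *env, int argc,
 *                                            const ERL_NIF_TERM argv[])
 *   {
 *       ErlNifSysInfo info;
 *       enif_system_info(&info, sizeof(info));
 *       // ... unpack w, h, bpp from argv ...
 *       if (choose_run_mode(info.dirty_scheduler_support, w, h, bpp) == RUN_DIRTY)
 *           return enif_schedule_nif(env, "create_image",
 *                                    ERL_NIF_DIRTY_JOB_CPU_BOUND,
 *                                    ecl_create_image, argc, argv);
 *       return ecl_create_image(env, argc, argv);
 *   }
 */
```

Small calls stay on the normal scheduler and pay no rescheduling cost; only calls that look expensive take the detour through enif_schedule_nif.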

tonyrog commented on July 20, 2024

I prepared the NIF table so you can switch between dirty and non-dirty. I also added the examples cl:noop_/0, which is dynamically dirty, and cl:dirty_noop/0, which is always dirty (if supported). You can find a small, simple benchmark in test/cl_noop that checks the call overhead.

arpieb commented on July 20, 2024

Nice, thanks! Once I finish up the 1.2 wrappers, docs and unit tests I'll take a swing at this.

arpieb commented on July 20, 2024

OK, just forked your latest to play around with dirty scheduler support and timings. So far I've made one tiny change to c_src/Makefile to allow USE_DIRTY_SCHEDULER to be set from the environment when being included as a dependency:

ifeq ($(USE_DIRTY_SCHEDULER), 1)
  $(info Compiling with support for dirty schedulers)
  CFLAGS += -DUSE_DIRTY_SCHEDULER
endif

I'll keep you posted on what I find out. It might wind up being a compile directive that will be enabled only for certain projects that know they are going to spend a lot of time in OpenCL calls...