Comments (6)
You have highlighted a problem that I think Aparapi needs to solve in order to
help with map-reduce-style workloads.
One trick (you may have tried this) is to use the pass ID (from
kernel.execute(range, passes)) to indicate that the final pass should perform
the reduce step. This way the data stays on the GPU and is not shuffled
back and forth.
So with a Kernel like
Kernel k = new Kernel(){
   @Override public void run(){
      if (getPassId() < 19){
         // do the first 19 passes (0..18)
      }else{
         // do the final reduction pass (19)
      }
   }
};
k.execute(1024, 20); // global size 1024, 20 passes
The real solution (and I am looking at this now) is some way to allow a Kernel
to have multiple entrypoints. I have yet to come up with a clean syntax for this,
so I would welcome suggestions.
Original comment by [email protected]
on 28 Oct 2011 at 3:22
from aparapi.
Original comment by [email protected]
on 28 Oct 2011 at 3:23
- Changed state: Accepted
- Added labels: Type-Enhancement
- Removed labels: Type-Defect
I thought of that, but would that not cause every core to execute both branches
for each pass (without storing the results)? Or is there a special case that
happens when all cores take the same branch?
Original comment by [email protected]
on 7 Nov 2011 at 12:09
Sorry Kenneth, for some reason I missed your question.
Because we use global memory (as far as the GPU is concerned) we must wait for all
kernel instances to complete. Conceptually they are all running at the same time, and we
can never expect one instance to see the result of another, even if we know the
group order. So I think this will always require us to relaunch the kernel.
Relaunch is not *that* expensive, especially if we can avoid moving buffers by
calling setExplicit(true) and taking control of buffer transfers ourselves.
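To illustrate the relaunch-per-pass idea, here is a minimal sketch in plain Java. It is a simulation only and uses no Aparapi classes: the hypothetical launchPass() method stands in for one kernel launch, its loop body plays the role of run(), and gid plays the role of getGlobalId(). Within a pass each simulated work-item writes only data[gid] and reads only data[gid + stride], so no item depends on another item's result from the same pass; results become visible only at the next launch.

```java
// Plain-Java simulation of a multi-pass sum reduction; no Aparapi classes are used.
final class MultiPassReduction {

    // Stand-in for one kernel launch over `stride` work-items.
    // Items write indices below `stride` and read indices at or above it,
    // so the iterations are independent and could run in any order.
    static void launchPass(float[] data, int stride) {
        for (int gid = 0; gid < stride; gid++) {
            data[gid] += data[gid + stride];
        }
    }

    // Host-side loop: one "launch" per pass, halving the active range each time.
    // Assumption for this sketch: the input length is a power of two.
    static float sum(float[] input) {
        if (Integer.bitCount(input.length) != 1) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        float[] data = input.clone(); // the buffer that would stay on the device
        for (int stride = data.length / 2; stride >= 1; stride /= 2) {
            launchPass(data, stride);
        }
        return data[0];
    }

    public static void main(String[] args) {
        System.out.println(sum(new float[] {1, 2, 3, 4, 5, 6, 7, 8})); // prints 36.0
    }
}
```

In a real kernel the buffer would stay on the device between launches, which is where explicit buffer management comes in.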
I just added a 'life' demo (Conway's Game of Life) which executes a Kernel
inside a fairly tight loop and only pulls the buffer from the GPU when
Swing wishes to display the data. This might help in your case as well.
Unless I have completely missed the point ;)
Gary
Original comment by [email protected]
on 10 Nov 2011 at 7:30
I discovered the "localBarrier()" function that seems to do what I was looking
for (synchronize).
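For readers unfamiliar with barrier-based reduction, here is a rough sketch of the pattern in plain Java. It is a simulation, not Aparapi code: each thread stands in for one work-item, the `local` array stands in for a local-memory buffer, and java.util.concurrent.CyclicBarrier stands in for localBarrier(). As far as I understand it, a local barrier synchronizes only the work-items within a single work-group, so this covers one group's partial result rather than a reduction across groups.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Plain-Java simulation of a barrier-synchronized tree reduction in one work-group.
final class BarrierReduction {

    // Assumption for this sketch: the input length (the group size) is a power of two.
    static int sum(int[] input) {
        final int n = input.length;
        if (Integer.bitCount(n) != 1) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        final int[] local = input.clone();                 // stand-in for local memory
        final CyclicBarrier barrier = new CyclicBarrier(n); // stand-in for localBarrier()
        Thread[] items = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int lid = i;                              // plays the role of the local id
            items[i] = new Thread(() -> {
                try {
                    for (int stride = n / 2; stride >= 1; stride /= 2) {
                        if (lid < stride) {
                            local[lid] += local[lid + stride];
                        }
                        // Every item waits here, including idle ones, so that
                        // all writes of this step are visible before the next.
                        barrier.await();
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            items[i].start();
        }
        try {
            for (Thread t : items) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return local[0];
    }
}
```

Note that every simulated work-item reaches the barrier on every step, even when it has no work left; skipping the barrier on some items would deadlock the rest.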
I did try the other suggestion (multi-pass) and that seemed to carry a heavy
performance penalty, but I abandoned the idea, so I am not sure what caused it.
Feel free to close this issue.
Original comment by [email protected]
on 27 Dec 2011 at 9:56
Original comment by [email protected]
on 20 Apr 2013 at 12:30
- Changed state: WontFix