Comments (12)

tmcdonell commented on May 27, 2024

I'm currently away on an internship and thus not working on accelerate right now, but if I have a spare moment I'll try to look into it. Sorry about that.

from accelerate.

nushio3 commented on May 27, 2024

Hi tmcdonell, thank you for your response. I hope your internship goes well.

By the way, I've managed to write a working simulator by using Cell (Acc (Array DIM2 Real)) instead of Acc (Array DIM2 (Cell Real)) as the representation of the state, where type Cell a = ((a,a,a), (a,a,a), (a,a,a), a).
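
To make the two representations concrete, here is a minimal sketch in plain Haskell, with ordinary lists standing in for Accelerate arrays. The helper names (pureCell, zipCellWith, toCellOfArrays, addCells) are made up for illustration and do not come from the linked repository.

```haskell
-- Plain lists stand in for Accelerate arrays in this sketch.
type Cell a = ((a, a, a), (a, a, a), (a, a, a), a)

-- Fill every slot of a cell with the same value.
pureCell :: a -> Cell a
pureCell x = ((x, x, x), (x, x, x), (x, x, x), x)

-- Combine two cells slot by slot.
zipCellWith :: (a -> b -> c) -> Cell a -> Cell b -> Cell c
zipCellWith f ((a0, a1, a2), (a3, a4, a5), (a6, a7, a8), a9)
              ((b0, b1, b2), (b3, b4, b5), (b6, b7, b8), b9) =
  ( (f a0 b0, f a1 b1, f a2 b2)
  , (f a3 b3, f a4 b4, f a5 b5)
  , (f a6 b6, f a7 b7, f a8 b8)
  , f a9 b9
  )

-- [Cell a] mirrors Array DIM2 (Cell Real) (one array of whole cells);
-- Cell [a] mirrors Cell (Array DIM2 Real) (one array per slot).
toCellOfArrays :: [Cell a] -> Cell [a]
toCellOfArrays = foldr (zipCellWith (:)) (pureCell [])

-- Elementwise addition on the cell-of-arrays form: one zipWith per slot.
addCells :: Num a => Cell [a] -> Cell [a] -> Cell [a]
addCells = zipCellWith (zipWith (+))
```

In the cell-of-arrays form, each arithmetic operation on the state becomes ten independent array operations, one per slot.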

However, the Accelerate implementation was about 500 times slower than its CUDA counterpart, sadly. This is due to my awkwardness in using Accelerate.

I have prepared some source code to illustrate the problem:
https://github.com/nushio3/accelerate-test/blob/fa6e7b3b92e8d2ab357dad26d133c591d5756ef1/step05/OptTest.hs

You can run it like this:

> ./OptTest.hs 1 /dev/null
. . . .
success.

Note that, in lines 72-73, I have:

instance Num AWR where
  a+b = A.use $ run $ A.zipWith (+) a b
  a-b = A.use $ run $ A.zipWith (-) a b

Since A.use . run is semantically equal to id, we should be able to remove those calls. But when I do so:

instance Num AWR where
  a+b = A.zipWith (+) a b
  a-b = A.use $ run $ A.zipWith (-) a b

I get this:

> ./OptTest.hs 1 /dev/null
. . . .
OptTest.hs: 
*** Internal error in package accelerate ***
*** Please submit a bug report at https://github.com/mchakravarty/accelerate/issues
./Data/Array/Accelerate/Smart.hs:886 ((+++)): Precondition violated

Or by doing this:

instance Num AWR where
  a+b = A.use $ run $ A.zipWith (+) a b
  a-b = A.zipWith (-) a b

I get this:

> ./OptTest.hs 1 /dev/null
. . . .
*** Internal error in package accelerate ***
*** Please submit a bug report at https://github.com/mchakravarty/accelerate/issues
./Data/Array/Accelerate/Smart.hs:321 (convertSharingAcc (prjIdx)): inconsistent valuation; sa = 51; env = [57]

Compared to id, the effect of A.use . run is to force smaller ASTs, which hinders optimization.
I guess there are some bugs in the optimization routines?
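
The difference between the two forms can be sketched as follows. This is only an illustration, assuming a current accelerate release and its reference interpreter (Data.Array.Accelerate.Interpreter) in place of the CUDA backend used in OptTest.hs; the names stage, fused and staged are made up here.

```haskell
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Vector)
import Data.Array.Accelerate.Interpreter (run)

-- Assumption: the interpreter's `run` stands in for the CUDA backend's `run`.
-- `stage` evaluates a computation immediately and re-embeds the result
-- as a constant input array, so later code sees no expression tree behind it.
stage :: A.Arrays a => Acc a -> Acc a
stage = A.use . run

-- One Accelerate program: both zipWith nodes live in a single AST.
fused :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Vector Float)
      -> Acc (Vector Float)
fused a b c = A.zipWith (-) (A.zipWith (+) a b) c

-- Same arithmetic, but every operation is run as its own small program.
staged :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Vector Float)
       -> Acc (Vector Float)
staged a b c = stage (A.zipWith (-) (stage (A.zipWith (+) a b)) c)
```

Both functions should produce the same array; they differ only in how much of the computation the library sees at once.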

mchakravarty commented on May 27, 2024

Trevor,

What do you think might be the problem here?

Manuel

On 14/07/2011 at 22:05, nushio3 wrote:

Hello,
I'm trying to use Accelerate for hydrodynamic simulations.
As an exercise, I'm writing a Lattice-Boltzmann solver with Accelerate. The program, still under construction, is here:

https://github.com/nushio3/accelerate-test/blob/7a8248fa30c0e728cea0fe03ccd21bf5bed8a5ef/step05/MainAcc.hs

I have also written what I want to express in C++ and CUDA; see
main-omp.cpp and main-cuda.cu in the same folder.

To begin with, I wrote a function to initialize the array in Accelerate
(it corresponds to the function 'initialize()' in fluid.h),
but it fails with a 'submit a bug report' error.

It says 'too many resources requested,' so I looked at the printout of Accelerate's kernel,
but to me it looks normal.
Am I doing something wrong that wastes resources?
Or should I decrease e.g. the resolution?

./MainAcc.hs 0
... some warnings omitted ...
map
(\x0 -> (+) ((+) ((+) ((+) ((+) ((+) ((+) ((+) (2 (3 x0),
1 (3 x0)),
0 (3 x0)),
2 (2 x0)),
1 (2 x0)),
0 (2 x0)),
2 (1 x0)),
1 (1 x0)),
0 (1 x0)))
(generate
(Z :. 1024) :. 768
(\x0 -> ((0.0,0.0,0.0),
(0.1,
0.7,
(+) (0.2,
(*) (1.0e-3,
(/) ((*) (12.0, fromIntegral (indexHead x0)), 768.0)))),
(0.0,0.0,0.0),
((<) ((+) ((*) (64.0,
(*) ((-) (fromIntegral (indexHead (indexTail x0)),
(/) (768.0, 6.0)),
(-) (fromIntegral (indexHead (indexTail x0)),
(/) (768.0, 6.0)))),
(*) ((-) (fromIntegral (indexHead x0), (/) (768.0, 2.0)),
(-) (fromIntegral (indexHead x0), (/) (768.0, 2.0)))),
(*) ((/) (768.0, 24.0), (/) (768.0, 24.0)))) ?
(1.0, 0.0))))
MainAcc.hs:
*** Internal error in package accelerate ***
*** Please submit a bug report at https://github.com/mchakravarty/accelerate/issues
./Data/Array/Accelerate/CUDA.hs:59 (unhandled): CUDA Exception: too many resources requested for launch
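
Read as ordinary Haskell, the function passed to generate in the printout above amounts to roughly the following. This is a sketch reconstructed from the dump alone, not taken from MainAcc.hs: the squared-distance terms are read as products, Double is an assumed element type, and the names initialCell, outer and inner are made up (inner corresponds to indexHead x0, outer to indexHead (indexTail x0)).

```haskell
type Cell a = ((a, a, a), (a, a, a), (a, a, a), a)

-- Initial value of one lattice site of the 1024 x 768 grid.
initialCell :: Int -> Int -> Cell Double
initialCell outer inner =
  ( (0.0, 0.0, 0.0)
  , (0.1, 0.7, 0.2 + 1.0e-3 * (12.0 * i / 768.0))
  , (0.0, 0.0, 0.0)
  , if insideEllipse then 1.0 else 0.0
  )
  where
    o = fromIntegral outer :: Double
    i = fromIntegral inner :: Double
    -- Offsets from the centre of the marked region.
    dOuter = o - 768.0 / 6.0
    dInner = i - 768.0 / 2.0
    radius = 768.0 / 24.0
    -- The comparison that feeds the `?` conditional in the printout.
    insideEllipse = 64.0 * (dOuter * dOuter) + dInner * dInner < radius * radius
```

The last slot is 1.0 inside a small elliptical region and 0.0 elsewhere; the remaining slots are constants apart from one linear ramp along the inner index.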

Reply to this email directly or view it on GitHub:
https://github.com/mchakravarty/accelerate/issues/25

tmcdonell commented on May 27, 2024

I cannot reproduce the first bug report, unfortunately. Specs for my test machine follow; as you can see it is not one of the high-end CUDA cards. Which version of the CUDA toolkit are you using? I haven't tried with the 4.x series yet, so maybe that has something to do with it (for example, if the way device capabilities are reported has changed). I'll test that next...

Prelude Foreign.CUDA.Driver> props =<< device 0
DeviceProperties {deviceName = "GeForce GT 120", computeCapability = 1.1, totalGlobalMem = 268107776, totalConstMem = 65536, sharedMemPerBlock = 16384, regsPerBlock = 8192, warpSize = 32, maxThreadsPerBlock = 512, maxBlockSize = (512,512,64), maxGridSize = (65535,65535,1), maxTextureDim1D = 8192, maxTextureDim2D = (65536,32768), maxTextureDim3D = (2048,2048,2048), clockRate = 1250000, multiProcessorCount = 4, memPitch = 2147483647, textureAlignment = 256, computeMode = Default, deviceOverlap = True, concurrentKernels = False, eccEnabled = False, kernelExecTimeoutEnabled = True, integrated = False, canMapHostMemory = True}
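
One common cause of a "too many resources requested for launch" error is a thread block whose threads together need more registers than the device offers per block. Whether that is what happened in this report is not established here; the following is only a rough budget check using the limits printed above. The Limits record and launchableThreads are hypothetical names, and the per-thread register count is something you would have to obtain from the compiler output.

```haskell
-- Per-block limits as reported by the driver (field names follow the props output).
data Limits = Limits
  { regsPerBlock       :: Int
  , maxThreadsPerBlock :: Int
  , warpSize           :: Int
  }

-- Values copied from the GeForce GT 120 output above.
gt120 :: Limits
gt120 = Limits { regsPerBlock = 8192, maxThreadsPerBlock = 512, warpSize = 32 }

-- Largest warp-aligned block size whose registers fit in one block.
-- Simplification: ignores register allocation granularity and shared memory.
launchableThreads :: Limits -> Int -> Int
launchableThreads lim regsPerThread = (cap `div` warpSize lim) * warpSize lim
  where
    cap = min (maxThreadsPerBlock lim) (regsPerBlock lim `div` regsPerThread)
```

For instance, a kernel needing 20 registers per thread (a made-up figure) gives launchableThreads gt120 20 = 384, so under this simplified model a 512-thread launch would not fit on this card.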

For the second, the program runs without the use . run statements if using @sseefried's patch for issue #22, although I'm not sure of the status of that patch relative to your own changes to sharing recovery.

nushio3 commented on May 27, 2024

Thank you, tmcdonell, for your effort. Let me see; the ghci trick
Prelude> :m +Foreign.CUDA.Driver
Prelude Foreign.CUDA.Driver> props =<< device 0
Loading package extensible-exceptions-0.1.1.2 ... linking ... done.
Loading package bytestring-0.9.1.10 ... linking ... done.
Loading package cuda-0.3.2.2 ... linking ... done.
*** Exception: CUDA Exception: driver not initialised
... didn't work for me. I'm using a Tesla M2050 (compute capability 2.0) with CUDA 3.2. I can upload the output of deviceQuery if you need it. I have tried a CUDA 4.0 environment too, but I couldn't install cuda-0.3.2.2 (the latest version on Hackage) there.

I'm trying the patch 5c24257 now...

tmcdonell commented on May 27, 2024

Ah, first you need to run initialise []; sorry for the omission. No matter --- the model number and driver version tell me everything I was interested in. I am also using nvcc version 3.2. Do you happen to be running a 64-bit version of GHC?

I have only done light testing on compute-2.0 devices since I only briefly had access to one. I recall there being some problems when the 2.0 series devices were released; maybe this is why the first example works on my 1.x series card but not your own...

nushio3 commented on May 27, 2024

Thanks, tmcdonell. With initialise, I could query the device using props =<< device 0.
With patch 5c24257, I could compile the code without use . run. Now benchmarking.

mchakravarty commented on May 27, 2024

Any progress on this problem?

nushio3 commented on May 27, 2024

Nice to hear from you again! I haven't tried accelerate since ICFP2011, where I was able to compute what I wanted in accelerate (but it was slow). Maybe it's a good time for me to try the latest accelerate again!

mchakravarty commented on May 27, 2024

Good to hear from you as well. There have been many changes to Accelerate in the last few months. So, it may indeed be worthwhile to have another look.

tmcdonell commented on May 27, 2024

I'm going to go ahead and close this issue, as both of the example programs work now (it is still slow, but that's a different issue).

Congratulations on your recent release of Paraiso!

nushio3 commented on May 27, 2024

Thank you for your congratulations!

I've noticed that Ryan Newton has joined and that accelerate has been
making rapid progress recently. I'd really like to try it again, but I
have something else to finish first...

Please keep up the good work!

2012/6/20 Trevor L. McDonell:

I'm going to go ahead and close this issue, as both of the example programs work now (it is still slow, but that's a different issue).

Congratulations on your recent release of Paraiso!

Takayuki MURANUSHI
The Hakubi Center for Advanced Research, Kyoto University
http://www.hakubi.kyoto-u.ac.jp/02_mem/h22/muranushi.html
