Comments (5)

vsmaier commented on September 28, 2024

I am actually a little bit puzzled. I was repeatedly calling a 600,000 x 100 distance calculation in a function that used neither rm() nor gc():

stressme <- function(){
  iter <- 0
  repeat {
    res <- distance(gpublocks,gpufacilities, method="sqEuclidean")
    result <- res[,]
    iter <- iter + 1
    if (iter > 2000){
      break
    }
    print(iter)
  }
  return("OK")
}

This morning this code stalled the machine after 3 iterations. What happens is that the GUI becomes unresponsive to any interaction. The mouse pointer still moves around but no mouse actions can be triggered, nor does the system accept keyboard inputs. This roughly matched my prior experiences.

Since then I did some additional processing and decided to run the loop with and without rm(), gc() calls. Turns out that it worked in either case. I completed a run with 2000 iterations without problems. I did notice though that the GUI became sluggish. Doing secondary things on the machine became a pain.

I then decided to do some speed comparisons with pdist and regular matrices, not distance and gpuDistances. Used system.time() for 10 iterations, then decided to fire up the gpuR above for 10 iterations as well. Stalled right after the first iteration. Had to reboot.

Only thing in the log I found so far:

Mar 1 16:05:03 Turbine watchdogd[256]: [watchdog_daemon] @(_wd_daemon_service_thread) - service (com.apple.WindowServer) reported as unresponsive
Mar 1 16:05:08 Turbine Safari[792]: tcp_connection_tls_session_error_callback_imp 12 __tcp_connection_tls_session_callback_write_block_invoke.434 error 22
Mar 1 16:05:08 Turbine spindump[394]: Saved userspace_watchdog_timeout.spin report for WindowServer version ??? (???) to /Library/Logs/DiagnosticReports/WindowServer_2016-03-01-160508_Turbine.userspace_watchdog_timeout.spin
Mar 1 16:05:08 Turbine watchdogd[256]: [watchdog_daemon] @(__wd_service_report_unresponsive_block_invoke) - spindump gathered for (com.apple.WindowServer) at (/Library/Logs/DiagnosticReports/WindowServer_2016-03-01-160508_Turbine.userspace_watchdog_timeout.spin)
Mar 1 16:05:28 Turbine watchdogd[256]: [watchdog_daemon] @(_wd_daemon_service_thread) - service (com.apple.WindowServer) reported as unresponsive
Mar 1 16:05:33 Turbine spindump[394]: Saved userspace_watchdog_timeout.spin report for WindowServer version ??? (???) to /Library/Logs/DiagnosticReports/WindowServer_2016-03-01-160533_Turbine.userspace_watchdog_timeout.spin
Mar 1 16:05:33 Turbine watchdogd[256]: [watchdog_daemon] @(__wd_service_report_unresponsive_block_invoke) - spindump gathered for (com.apple.WindowServer) at (/Library/Logs/DiagnosticReports/WindowServer_2016-03-01-160533_Turbine.userspace_watchdog_timeout.spin)

So I am now leaning towards assuming either a hardware defect or an OS problem. I'll see if I can find another Mac user to replicate this. I'll also run the dist function in a loop to see if that is affected as well. Maybe that will offer some new clues. I'll also run loops with rm() and gc() to see if the problem gets "fixed" that way.

One more question - what size matrices could be reasonably expected to compute successfully and in what time? I did notice some sharp dropoff in performance per distance computed using 600,000 x 100 vs 100,000 x 10.
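
As a rough guide to the sizes involved, the memory needed to hold one dense result matrix can be estimated directly. This is only a sketch: it assumes 8-byte double-precision values (4 bytes if the matrices are stored as floats), and `result_mb` is a hypothetical helper, not part of gpuR.

```r
# Hypothetical helper: megabytes needed to hold one dense
# n_rows x n_cols distance matrix.
result_mb <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 1e6
}

result_mb(600000, 100)   # 480 MB per result
result_mb(100000, 10)    # 8 MB per result
```

Under that assumption each 600,000 x 100 result is sixty times larger than a 100,000 x 10 one, so a few results that have not yet been freed can occupy a large share of GPU memory.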

from gpur.

vsmaier commented on September 28, 2024

At this point I have been able to run code like this

stressme <- function(){
  iter <- 0
  repeat {
    gpublocks <- gpuR::vclMatrix( as.matrix(  mblock[sample(1:nrow(mblock), 500000,replace=FALSE),] ))
    gpufacilities <- gpuR::vclMatrix( as.matrix(  mblock[sample(1:nrow(mblock), 100,replace=FALSE),] ))
    res <- distance(gpublocks,gpufacilities, method="sqEuclidean")
    result <- res[,]
    #rm(res)
    #gc()
    #rm(result)
    #gc()
    #rm(gpublocks)
    #gc()
    #rm(gpufacilities)
    #gc()
    iter <- iter + 1
    if (iter > 1000){
      break
    }
    print(iter)
  }
  return("OK")
}

for hours at a time with and without the comments around garbage collection. And the only issue I have encountered was this message upon interrupting the code execution:

 Error in gpuR::vclMatrix(as.matrix(mblock[sample(1:nrow(mblock), 5e+05,  : 
  error in evaluating the argument 'data' in selecting a method for function 'vclMatrix': Error in base::try(res, TRUE) : object 'res' not found

Again, when objects are not removed with rm() and gc() is not called, there is a definite impact on GUI responsiveness while the loop executes.

The best tool I have found to monitor GPU behavior is iStat. From what I can tell, the gpuR code is, not surprisingly, executed on the GPU that drives the display. Down the road I'll see if I can mess with this.

One odd behavior was that allocating several large matrices using

gpuR::vclMatrix()

initially showed memory decrease on the GPU. But then memory use seemed to stay constant, indicative of some sort of buffer using main memory. Again, all I have is the iStat report on this.

So all of this may be pointing back to some weird platform issue in my setup.

Regarding the size of the matrices - I do realize that what I am trying to do is really pushing it. Ideally I would like to run 11,000,000 x (small number, possibly 1 only) distances, keeping the larger matrix in GPU memory. Hence the tests with around 600,000 x 10 pairs. I actually do not want to work with all the distances, only those falling under a certain small threshold. The idea is to see how this performs compared to traditional spatial indexing.
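
The "only distances under a threshold" idea can be sketched in plain base R by processing the large matrix in chunks and keeping only the qualifying pairs. This is a CPU-only illustration, not gpuR code: `sq_dist`, `near_pairs`, and `chunk_size` are hypothetical names, and on a GPU the call to `sq_dist` would presumably be replaced by a `distance(..., method="sqEuclidean")` call on vclMatrix objects.

```r
# Squared Euclidean distances between the rows of a and the rows of b,
# using ||a_i - b_j||^2 = ||a_i||^2 + ||b_j||^2 - 2 * (a_i . b_j).
sq_dist <- function(a, b) {
  outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
}

# Walk through the points in chunks and keep only the pairs whose
# squared distance falls below the threshold.
near_pairs <- function(points, facilities, threshold, chunk_size = 1000L) {
  starts <- seq(1L, nrow(points), by = chunk_size)
  hits <- lapply(starts, function(s) {
    idx <- s:min(s + chunk_size - 1L, nrow(points))
    d <- sq_dist(points[idx, , drop = FALSE], facilities)
    keep <- which(d < threshold, arr.ind = TRUE)
    data.frame(point    = idx[keep[, "row"]],
               facility = keep[, "col"],
               sq_dist  = d[keep])
  })
  do.call(rbind, hits)
}

pts <- rbind(c(0, 0), c(3, 4), c(10, 10))
fac <- rbind(c(0, 1))
near_pairs(pts, fac, threshold = 20, chunk_size = 2L)
```

Only one chunk-sized distance matrix exists at a time, so peak memory depends on `chunk_size` rather than on the total number of points. The expanded formula can produce tiny negative values from rounding, which matters if the threshold is very close to zero.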

Maybe it is time to close this issue? At least until more concrete evidence of a bug in gpuR comes up again?

cdeterman commented on September 28, 2024

@vsmaier I see now. I was mistaken in thinking you were doing all pairwise comparisons for the 600,000 matrix. It shouldn't be a problem with the comparisons between the 600K and 100 objects. I have made some minor changes to more efficiently handle temporary objects within the dist/distance functions. However, you still will likely need to use the rm() and gc() functions.

I have duplicated the memory issue using your stressme function. My GPU runs out of memory after 3 or 4 iterations. The distance computation finishes too quickly for the R garbage collector to run between iterations, so the memory held by earlier results is not released in time. Explicitly removing the object and calling gc() frees the GPU memory. I confirmed this with watch nvidia-smi (which reports memory use on NVIDIA GPUs only) on my Ubuntu system. Once I add the rm() and gc() calls, it runs without problems through 2000 iterations.

stressme <- function(){
    iter <- 0
    repeat {
        res <- distance(gpublocks,gpufacilities, method="sqEuclidean")
        result <- res[,]
        rm(res)
        gc()
        iter <- iter + 1
        if (iter > 2000){
            break
        }
        print(iter)
    }
    return("OK")
}
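
The reason rm() alone is not enough can be imitated in base R, without a GPU. The sketch below assumes that GPU buffers are released by a finalizer attached to the R object, which is an assumption about gpuR's internals rather than something stated in this thread. `live_buffers`, `release_buffer`, and `alloc_buffer` are hypothetical stand-ins.

```r
# Counter standing in for "device buffers currently allocated".
live_buffers <- 0L

# Finalizer: runs when the garbage collector reclaims a handle.
release_buffer <- function(e) {
  live_buffers <<- live_buffers - 1L
}

# Hypothetical allocator: returns a handle (an environment) whose
# finalizer releases the simulated buffer.
alloc_buffer <- function() {
  live_buffers <<- live_buffers + 1L
  handle <- new.env()
  reg.finalizer(handle, release_buffer)
  handle
}

res <- alloc_buffer()
rm(res)           # drops the R reference; the finalizer has not necessarily run yet
invisible(gc())   # forces a collection, which runs the pending finalizer
live_buffers
```

rm() only makes the object unreachable; the resource behind it is given back when a collection actually happens, which is why the loop above pairs rm(res) with gc().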

The errors in the Turbine log are, however, outside my expertise. I am primarily a Linux guy. I work with Macs and try to make sure there is support, but that is the extent of my knowledge. That is something you will need to troubleshoot elsewhere. I would be happy to hear about any solution though.

Let me know if the most recent changes work (I have only applied them to vclMatrix classes ATM). If this fixes things I will begin a few more updates.

Also, with AMD you may be able to check GPU memory usage with something like aticonfig --odgc --odgt which I found here. Just a thought to perhaps help.

cdeterman commented on September 28, 2024

@vsmaier did this address your problem? If I don't hear back I will assume the issue has been addressed and I will close this issue as it is fully operational on my end.

vsmaier commented on September 28, 2024

Yes, please close the issue. Forcing collection seems to be sufficient.

Thanks
