Comments (13)

ptillet commented on August 23, 2024

This all makes sense! The idea is that I ended up choosing top-5 to increase the prediction accuracy. But if you benchmark multiple kernels that modify one of their inputs, then you need to be careful to copy back the result. Can you confirm that setting N_TOP=1 resolves the issue?
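
To illustrate the point about kernels that modify one of their inputs, here is a minimal Python sketch. It is not ISAAC's code: the two `axpy_*` functions, `run_best`, and the list-based "buffers" are hypothetical stand-ins, and `N_TOP` only mirrors the setting named above. Each candidate is timed on a scratch copy, so the real output is updated exactly once by the winner.

```python
import time

N_TOP = 5  # hypothetical constant mirroring the setting discussed above


def axpy_inplace(alpha, x, y):
    """Stand-in for a kernel that modifies one of its inputs: y <- alpha*x + y."""
    for i in range(len(y)):
        y[i] += alpha * x[i]


def axpy_inplace_reversed(alpha, x, y):
    """Second hypothetical candidate: same result, different traversal order."""
    for i in reversed(range(len(y))):
        y[i] += alpha * x[i]


def run_best(candidates, alpha, x, y):
    """Time up to N_TOP candidates on scratch copies, then run the fastest on y."""
    timings = []
    for kernel in candidates[:N_TOP]:
        scratch = list(y)  # benchmarking must not touch the real output
        start = time.perf_counter()
        kernel(alpha, x, scratch)
        timings.append((time.perf_counter() - start, kernel))
    best = min(timings, key=lambda t: t[0])[1]
    best(alpha, x, y)  # the only update applied to the real output
    return best


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
run_best([axpy_inplace, axpy_inplace_reversed], 2.0, x, y)
print(y)  # [12.0, 24.0, 36.0]
```

Without the scratch copy, every benchmarked candidate would accumulate into `y`, and the final result would depend on how many candidates were tried.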

from triton.

listenlink commented on August 23, 2024

Hi, it still fails with N_TOP=1.

gongzg commented on August 23, 2024

@ptillet could you reproduce this issue on your BDW machine?

ptillet commented on August 23, 2024

Yes, I will try this tomorrow. I suspect it comes from copying back the output's result, or perhaps from a problem with queues. In the meantime, could you try N_TOP=1 and set modify_output = false?

listenlink commented on August 23, 2024

Sorry, it still fails on my side with N_TOP=1 and modify_output = false.

ptillet commented on August 23, 2024

Haha. That sounds pretty bad! I'll definitely take a look at it.

ptillet commented on August 23, 2024

Some updates: I've found a bunch of other bugs while trying to reproduce the issue with Caffe. I've fixed some things, and I'm tuning ISAAC for Intel's latest driver and for double precision on Broadwell. I hope to have everything up and running by the end of the weekend.

ptillet commented on August 23, 2024

Could you try the latest master? It should not only fix bugs for clCaffe, but also add double-precision support and performance improvements with the latest Intel OpenCL 2.0 driver.

Caffe seems to work fine with ISAAC on my Broadwell machine.

gongzg commented on August 23, 2024

@ptillet I tried the latest master with clCaffe. The test suite no longer crashes, but there are still some failures:

[ FAILED ] SGDSolverTest/2.TestSnapshot, where TypeParam = caffe::GPUDevice
[ FAILED ] SGDSolverTest/2.TestLeastSquaresUpdateWithEverything, where TypeParam = caffe::GPUDevice
[ FAILED ] SGDSolverTest/2.TestLeastSquaresUpdateWithWeightDecayMultiIter, where TypeParam = caffe::GPUDevice
[ FAILED ] SGDSolverTest/2.TestLeastSquaresUpdateWithEverythingAccum, where TypeParam = caffe::GPUDevice
[ FAILED ] AdaGradSolverTest/2.TestSnapshot, where TypeParam = caffe::GPUDevice
[ FAILED ] NesterovSolverTest/2.TestNesterovLeastSquaresUpdateWithEverything, where TypeParam = caffe::GPUDevice
[ FAILED ] AdaDeltaSolverTest/2.TestLeastSquaresUpdateWithEverythingAccum, where TypeParam = caffe::GPUDevice
[ FAILED ] AdaDeltaSolverTest/2.TestSnapshot, where TypeParam = caffe::GPUDevice
[ FAILED ] AdamSolverTest/2.TestSnapshot, where TypeParam = caffe::GPUDevice
[ FAILED ] AdamSolverTest/2.TestAdamLeastSquaresUpdateWithEverything, where TypeParam = caffe::GPUDevice
[ FAILED ] RMSPropSolverTest/2.TestRMSPropLeastSquaresUpdateWithEverything, where TypeParam = caffe::GPUDevice
[ FAILED ] RMSPropSolverTest/2.TestSnapshot, where TypeParam = caffe::GPUDevice
[ FAILED ] InnerProductLayerTest/2.TestGradientTranspose, where TypeParam = caffe::GPUDevice

And if I run some of the test cases directly, such as:
test/test.testbin --gtest_filter=SGDSolverTest/2.TestLeastSquaresUpdateWithEverything
they fail or succeed at random. But if I use the old version:

commit 6ac5e1f55b1cae59394758f823d5c58f57ca561d
Author: Philippe Tillet <[email protected]>
Date:   Fri Jan 1 05:44:28 2016 -0500

    Templates/Reduce1D: now properly loading 2D scalars

it always passes all of the float GPU test cases.

ptillet commented on August 23, 2024

I see, thanks. I'm also having issues with some tests passing when called individually but failing when running the entire test suite. I'll solve this ASAP.

ptillet commented on August 23, 2024

Sometimes, ISAAC's GEMM uses two kernels. The event returned by clblasSgemm always corresponded to the first one, so waiting on it did not guarantee that the second kernel had finished, which led to some synchronization issues. Setting label=0 solved the problem because it forced the library to use only one kernel. Fixed in f226837.
Could you retry now?
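
For readers unfamiliar with this class of bug, the following Python sketch imitates it with threads instead of OpenCL. Everything in it is hypothetical (`launch`, the `gate` event, the dictionary used as an output buffer); it does not use the clBLAS or ISAAC APIs and is not the fix from the commit above. It only shows why waiting on the first stage's event is not enough when an operation runs in two stages.

```python
import threading


def launch(out, gate):
    """Hypothetical two-stage operation; returns one event per stage."""
    first_done = threading.Event()
    second_done = threading.Event()

    def worker():
        out["partial"] = 1                 # "kernel 1" writes partial results
        first_done.set()
        gate.wait()                        # "kernel 2" is still queued
        out["final"] = out["partial"] + 1  # "kernel 2" produces the real output
        second_done.set()

    thread = threading.Thread(target=worker)
    thread.start()
    return first_done, second_done, thread


out = {}
gate = threading.Event()  # lets the demo hold back the second stage
first_done, second_done, thread = launch(out, gate)

first_done.wait()
stale = "final" not in out  # True: the first event alone says nothing about stage 2

gate.set()
second_done.wait()          # waiting on the last event is what makes reading safe
ready = out["final"] == 2
thread.join()

print(stale, ready)  # True True
```

A caller that only receives `first_done` can read the output before the second stage has written it, which would match tests that pass or fail at random.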

gongzg commented on August 23, 2024

@ptillet Nice catch, it works great with clCaffe now. Thanks for the quick fix!

ptillet commented on August 23, 2024

That's good to hear :)
