Comments (7)
There is going to be a change soon that will introduce multiple streams in order to overlap packing and sending with the interior computation. If you are using MPI, I guess you compile from source. I'll update this issue once that change has been implemented, so you can redo your performance tests.
from rocalution.
Currently, rocALUTION does not support direct GPU-GPU communication. This is something we will be adding in a future release, however.
from rocalution.
Thanks for your quick answer.
Do you have an idea of when this feature will be supported?
from rocalution.
I am not able to give you a specific release. May I ask why you are interested in such a feature? I do not expect any performance improvement for most solvers, as time-critical algorithms such as SpMV can fully hide communication behind computation.
from rocalution.
I understand your point, but the benchmark case we are working on shows that about 10% of the time of each rocALUTION solver call (1 per time step) is spent in hipMemcpy for the ghost-cell exchange, with almost no overlap with computational kernels. We use the CG solver with the c-amg preconditioner. It would have been nice to try this feature on AMD nodes specifically designed for it. We will test it when it is available :).
from rocalution.
Sounds like a good feature to try, thanks!
from rocalution.
I have merged some major modifications to GlobalMatrix::Apply() in order to overlap communication and packing with the interior matrix-vector multiplication. This should slightly improve SpMV performance and scale linearly with additional nodes / GPUs. Let me know if this fixes the issue you have been observing.
It will still use the host for communication, but you should not see any communication-related slowdowns.
Note that you will have to check out the develop branch in order to see those changes.
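For readers unfamiliar with the idea, here is a minimal host-only sketch of what overlapping the ghost exchange with the interior product means. It is not the rocALUTION implementation: the Csr struct, spmv_add, exchange_ghosts and overlapped_apply are hypothetical names made up for this illustration, and std::async stands in for the real MPI exchange and GPU streams.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical CSR storage for illustration only (not a rocALUTION type).
struct Csr {
    std::vector<int> row_ptr;
    std::vector<int> col;
    std::vector<double> val;
};

// Accumulates y += A * x for a CSR matrix A.
void spmv_add(const Csr& A, const std::vector<double>& x, std::vector<double>& y) {
    assert(A.row_ptr.size() == y.size() + 1);
    for (std::size_t i = 0; i < y.size(); ++i) {
        for (int k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k) {
            y[i] += A.val[k] * x[A.col[k]];
        }
    }
}

// Assumed stand-in for the halo exchange. A real code would send the packed
// boundary values to neighboring ranks and receive theirs; here the values
// are simply echoed back so the sketch runs without MPI.
std::vector<double> exchange_ghosts(std::vector<double> packed) {
    return packed;
}

// Computes y = A_interior * x_local + A_ghost * x_ghost, with the exchange
// running concurrently with the interior product.
std::vector<double> overlapped_apply(const Csr& interior, const Csr& ghost,
                                     const std::vector<double>& x_local,
                                     const std::vector<double>& packed) {
    // 1. Start the exchange on another thread.
    std::future<std::vector<double>> pending =
        std::async(std::launch::async, exchange_ghosts, packed);

    // 2. The interior part needs only local data, so it runs meanwhile.
    std::vector<double> y(x_local.size(), 0.0);
    spmv_add(interior, x_local, y);

    // 3. Wait for the ghost values, then add their contribution.
    std::vector<double> x_ghost = pending.get();
    spmv_add(ghost, x_ghost, y);
    return y;
}
```

If the interior product takes at least as long as the exchange, step 3 does not have to wait and the communication cost is hidden, which is the effect described above.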
from rocalution.