Comments (3)
GDRCopy uses the CPU to push data to and pull data from the GPU. It has lower latency than the GPU's copy engine, which cuMemcpy uses (with some exceptions). However, the CPU is generally not as good as the copy engine at moving large amounts of data.
To clarify, GDRCopy is not a replacement for cuMemcpy. The most efficient approach is to use GDRCopy for small transfers and cuMemcpy for large ones. You will need to find the crossover point between the two on each platform where you run your applications, so that you know for which sizes to use GDRCopy.
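Finding the crossover point can be reduced to a small lookup once both paths have been timed at the same sizes. The sketch below is not part of GDRCopy or CUDA: copy_sample, find_crossover, and should_use_gdrcopy are hypothetical names, and the timings are assumed to come from your own benchmark runs on the target platform, sorted by increasing size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One benchmark sample: a transfer size and the time each method took.
 * The values are assumed to come from your own measurements. */
typedef struct {
    size_t size_bytes;
    double gdrcopy_us;   /* time measured with the CPU-driven GDRCopy path */
    double cumemcpy_us;  /* time measured with cuMemcpy */
} copy_sample;

/* Returns the first measured size at which cuMemcpy is at least as fast
 * as GDRCopy. Samples must be sorted by increasing size_bytes.
 * Returns SIZE_MAX if GDRCopy was faster at every measured size. */
size_t find_crossover(const copy_sample *samples, size_t n)
{
    assert(samples != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (samples[i].cumemcpy_us <= samples[i].gdrcopy_us)
            return samples[i].size_bytes;
    }
    return SIZE_MAX;
}

/* Nonzero if a transfer of size_bytes falls below the crossover,
 * i.e. in the range where GDRCopy was measured to be faster. */
int should_use_gdrcopy(size_t size_bytes, size_t crossover_bytes)
{
    return size_bytes < crossover_bytes;
}
```

At run time an application would call should_use_gdrcopy(size, crossover) and fall back to cuMemcpy when it returns 0. Since the crossover differs per platform, the table has to be rebuilt on each system where the application runs.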
from gdrcopy.
Hi @nkflash,
GDRCopy is designed for small message sizes. Based on what you have shown, GDRCopy is faster than cuMemcpy in that regime. Can you elaborate on the issue?
Also, you may want to run the application with numactl -C <cpus> -l or numactl -N <nodes> -l. It will perform better if you pin it to a CPU core that is close to the GPU in the PCIe tree. The -l option makes sure that the host buffer is allocated from the NUMA node local to the CPU the process runs on.
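If launching through numactl is not convenient, a process can also pin itself. The following is a minimal sketch assuming Linux with glibc. It uses sched_setaffinity, which is a different mechanism from numactl, and pin_to_cpu and first_allowed_cpu are hypothetical helper names. It does not set a memory policy the way -l does, and it does not decide which CPU is close to the GPU; that choice still has to come from the platform topology.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, -1 on failure (errno is set by sched_setaffinity). */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);  /* pid 0 = calling thread */
}

/* Hypothetical helper: lowest-numbered CPU the calling thread is
 * currently allowed to run on, or -1 if the affinity mask cannot be read. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}
```

Calling pin_to_cpu early, before the host buffers are allocated and first touched, keeps the thread on the chosen core for the rest of the run. Whether this matches the numactl results should be checked by benchmarking on the target system.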
Thank you for your quick response.
Sure, I see that it performs better at small sizes than at large sizes. In this case I did not bind the process to a specific CPU. If I bind it to a specific CPU, is GDRCopy comparable to cuMemcpy at large sizes?
BTW: Why is GDRCopy only suitable for small message sizes?