
Comments (4)

merthidayetoglu commented on August 23, 2024

"Recent versions of MPI include an alternate interface that uses MPI_Count as the type for numbers of elements. In C, these have a _c appended to the name. For example,

int MPI_Send_c(const void *buf, MPI_Count count, MPI_Datatype datatype,
int dest, int tag, MPI_Comm comm)

The intent of this interface is to handle exactly this issue.", Bill Gropp

This solution worked seamlessly; replacing MPI_Isend with MPI_Isend_c and MPI_Irecv with MPI_Irecv_c was sufficient.

CommBench/comm.h

Lines 700 to 703 in 6d51166

for (int send = 0; send < numsend; send++)
  MPI_Isend_c(sendbuf[send] + sendoffset[send], sendcount[send] * sizeof(T), MPI_BYTE, sendproc[send], 0, comm_mpi, &sendrequest[send]);
for (int recv = 0; recv < numrecv; recv++)
  MPI_Irecv_c(recvbuf[recv] + recvoffset[recv], recvcount[recv] * sizeof(T), MPI_BYTE, recvproc[recv], 0, comm_mpi, &recvrequest[recv]);
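For reference, a minimal standalone sketch of the same large-count calls, assuming an MPI 4.0 library (the 16 GB buffer size and the 0->1 send/receive pairing are illustrative only):

/* Minimal sketch of the MPI-4 large-count point-to-point calls (assumes MPI >= 4.0). */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  MPI_Count count = 16000000000LL;        /* 16 GB as a byte count, well above INT_MAX */
  char *buf = malloc((size_t)count);      /* illustrative host buffer */
  MPI_Request req = MPI_REQUEST_NULL;

  if (rank == 0)
    MPI_Isend_c(buf, count, MPI_BYTE, 1, 0, MPI_COMM_WORLD, &req);  /* large-count send */
  else if (rank == 1)
    MPI_Irecv_c(buf, count, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &req);  /* large-count receive */
  MPI_Wait(&req, MPI_STATUS_IGNORE);      /* no-op for ranks that posted nothing */

  free(buf);
  MPI_Finalize();
  return 0;
}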

As a test, the output below shows a micro-benchmark of point-to-point bandwidth across two nodes of Aurora. The message size is 16 GB.

******************** MPI COMMUNICATOR IS CREATED
printid: 0 Create Bench 0 with 24 processors
  Port: SYCL, Library: dummy
printid: 0 Create Bench 1 with 24 processors
  Port: SYCL, Library: MPI
Bench 1 comm 0 (0->12) sendbuf 0xff00000000200000 sendoffset 0 recvbuf 0xff000003b9e00000 recvoffset 0 count 16000000000 (16.0000 GB) MPI

CommBench 1: MPI communication matrix (reciever x sender) nnz: 1
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
1 . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 
send footprint: 16000000000 16.0000 GB
recv footprint: 16000000000 16.0000 GB

5 warmup iterations (in order):
startup 2.20e+04 warmup: 7.27e+05
startup 6.82e+01 warmup: 6.88e+05
startup 6.52e+01 warmup: 6.88e+05
startup 6.42e+01 warmup: 6.88e+05
startup 6.38e+01 warmup: 6.88e+05
10 measurement iterations (sorted):
start: 6.2896e+01 time: 6.8806e+05 -> min
start: 6.3082e+01 time: 6.8813e+05
start: 6.3186e+01 time: 6.8814e+05
start: 6.3288e+01 time: 6.8818e+05
start: 6.3303e+01 time: 6.8826e+05
start: 6.3381e+01 time: 6.8834e+05 -> median
start: 6.3384e+01 time: 6.8836e+05
start: 6.3557e+01 time: 6.8840e+05
start: 6.3594e+01 time: 6.8842e+05
start: 6.3855e+01 time: 6.8872e+05 -> max

data: 16.0000 GB
minTime: 6.8806e+05 us, 4.3004e+01 ms/GB, 2.3254e+01 GB/s
medTime: 6.8834e+05 us, 4.3021e+01 ms/GB, 2.3244e+01 GB/s
maxTime: 6.8872e+05 us, 4.3045e+01 ms/GB, 2.3231e+01 GB/s
avgTime: 6.8830e+05 us, 4.3019e+01 ms/GB, 2.3246e+01 GB/s
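
The reported rates follow directly from the data volume and the measured times: for the minimum, 16 GB / 6.8806e+05 us = 23.25 GB/s, or equivalently about 43.0 ms per GB.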


merthidayetoglu commented on August 23, 2024

The same code with the _c API did not compile with the default MPI version on Frontier.

merth@frontier08929:~/HiCCL/collectives> make
CC -std=c++14 -fopenmp -I/opt/rocm-5.3.0/include -D__HIP_ROCclr__ -D__HIP_ARCH_GFX90A__=1 -x hip -O3  main.cpp -c -o main.o -craype-verbose
clang++ -D__HIP_PLATFORM_AMD__ -D__HIP_PLATFORM_HCC__ --offload-arch=gfx90a -D__HIP_ARCH_GFX90A__=1 -dynamic -D__CRAY_X86_TRENTO -D__CRAY_AMD_GFX90A -D__CRAYXT_COMPUTE_LINUX_TARGET --gcc-toolchain=/opt/cray/pe/gcc/10.3.0/snos -isystem /opt/cray/pe/cce/15.0.0/cce-clang/x86_64/lib/clang/15.0.2/include -isystem /opt/cray/pe/cce/15.0.0/cce/x86_64/include/craylibs -std=c++14 -fopenmp -I/opt/rocm-5.3.0/include -D__HIP_ROCclr__ -D__HIP_ARCH_GFX90A__=1 -x hip -O3 main.cpp -c -o main.o -I/opt/cray/pe/libsci/22.12.1.1/CRAY/9.0/x86_64/include -I/opt/cray/pe/mpich/8.1.23/ofi/cray/10.0/include -I/opt/cray/pe/dsmml/0.2.2/dsmml//include -I/opt/cray/pe/pmi/6.1.8/include -I/opt/cray/xpmem/2.6.2-2.5_2.22__gd067c3f.shasta/include 
In file included from main.cpp:19:
In file included from ./../hiccl.h:24:
In file included from ./../CommBench/commbench.h:143:
./../CommBench/comm.h:701:11: error: use of undeclared identifier 'MPI_Isend_c'
          MPI_Isend_c(sendbuf[send] + sendoffset[send], sendcount[send] * sizeof(T), MPI_BYTE, sendproc[send], 0, comm_mpi, &sendrequest[send]);
          ^
./../source/comm.h:190:34: note: in instantiation of member function 'CommBench::Comm<unsigned long>::start' requested here
            commandptr[i]->comm->start();
                                 ^
./../source/bench.h:17:10: note: in instantiation of member function 'HiCCL::Comm<unsigned long>::run' requested here
    comm.run();
         ^
main.cpp:174:12: note: in instantiation of function template specialization 'HiCCL::measure<unsigned long>' requested here
    HiCCL::measure<Type>(warmup, numiter, count * numproc, coll);
           ^
In file included from main.cpp:19:
In file included from ./../hiccl.h:24:
In file included from ./../CommBench/commbench.h:143:
./../CommBench/comm.h:703:11: error: use of undeclared identifier 'MPI_Irecv_c'
          MPI_Irecv_c(recvbuf[recv] + recvoffset[recv], recvcount[recv] * sizeof(T), MPI_BYTE, recvproc[recv], 0, comm_mpi, &recvrequest[recv]);
          ^
2 errors generated when compiling for gfx90a.
make: *** [Makefile:17: main.o] Error 1
merth@frontier08929:~/HiCCL/collectives> 
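
Since the failure is just the missing MPI-4 symbols, one portable option (a sketch, not CommBench code; the helper name isend_bytes is hypothetical, while MPI_VERSION is the standard macro from mpi.h) is to guard the large-count call behind a compile-time version check and fall back to the classic int-count interface otherwise:

/* Sketch: use the large-count call only when the MPI library is 4.0 or newer. */
#include <mpi.h>
#include <stddef.h>

static void isend_bytes(void *buf, size_t bytes, int dest, int tag,
                        MPI_Comm comm, MPI_Request *req) {
#if MPI_VERSION >= 4
  /* MPI 4.0+: MPI_Count carries the full byte count */
  MPI_Isend_c(buf, (MPI_Count)bytes, MPI_BYTE, dest, tag, comm, req);
#else
  /* pre-4.0: the caller must keep bytes <= INT_MAX, e.g. by chunking */
  MPI_Isend(buf, (int)bytes, MPI_BYTE, dest, tag, comm, req);
#endif
}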


merthidayetoglu commented on August 23, 2024

Temporary fix as below: the message is split into pieces of at most 2e9 bytes, so each count handed to the classic int-count interface fits in an int.

CommBench/comm.h

Lines 290 to 296 in 5fe5914

int max = 2e9 / sizeof(T);
while (count > max) {
  add(sendbuf, sendoffset, recvbuf, recvoffset, max, sendid, recvid);
  count = count - max;
  sendoffset += max;
  recvoffset += max;
}
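
The same idea as a standalone sketch (the helper name isend_chunked and the caller-provided request array are assumptions, not CommBench API): split the byte stream into pieces of at most 2e9 bytes and post one request per piece.

/* Sketch: chunked non-blocking send for messages larger than INT_MAX bytes. */
#include <mpi.h>

#define CHUNK_BYTES 2000000000LL  /* same 2e9 cap as in the excerpt above */

static int isend_chunked(char *buf, long long bytes, int dest, int tag,
                         MPI_Comm comm, MPI_Request *reqs) {
  int n = 0;
  while (bytes > 0) {
    long long piece = bytes < CHUNK_BYTES ? bytes : CHUNK_BYTES;
    MPI_Isend(buf, (int)piece, MPI_BYTE, dest, tag, comm, &reqs[n++]);
    buf   += piece;
    bytes -= piece;
  }
  return n;  /* number of requests posted; the caller waits on all of them */
}

The receiver posts one matching MPI_Irecv per piece, which is exactly what the split add() calls produce: the 32 GB transfer below becomes 16 messages of 2 GB each.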

Test across nodes on Frontier:

******************** MPI COMMUNICATOR IS CREATED
printid: 0 Create Bench 0 with 16 processors
  Port: HIP, Library: dummy
printid: 0 Create Bench 1 with 16 processors
  Port: HIP, Library: MPI
Bench 1 comm 0 (0->8) sendbuf 0x7ff726c00000 sendoffset 0 recvbuf 0x7fefb3400000 recvoffset 0 count 500000000 (2.0000 GB) MPI
Bench 1 comm 1 (0->8) sendbuf 0x7ff726c00000 sendoffset 500000000 recvbuf 0x7fefb3400000 recvoffset 500000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 2 (0->8) sendbuf 0x7ff726c00000 sendoffset 1000000000 recvbuf 0x7fefb3400000 recvoffset 1000000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 3 (0->8) sendbuf 0x7ff726c00000 sendoffset 1500000000 recvbuf 0x7fefb3400000 recvoffset 1500000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 4 (0->8) sendbuf 0x7ff726c00000 sendoffset 2000000000 recvbuf 0x7fefb3400000 recvoffset 2000000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 5 (0->8) sendbuf 0x7ff726c00000 sendoffset 2500000000 recvbuf 0x7fefb3400000 recvoffset 2500000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 6 (0->8) sendbuf 0x7ff726c00000 sendoffset 3000000000 recvbuf 0x7fefb3400000 recvoffset 3000000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 7 (0->8) sendbuf 0x7ff726c00000 sendoffset 3500000000 recvbuf 0x7fefb3400000 recvoffset 3500000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 8 (0->8) sendbuf 0x7ff726c00000 sendoffset 4000000000 recvbuf 0x7fefb3400000 recvoffset 4000000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 9 (0->8) sendbuf 0x7ff726c00000 sendoffset 4500000000 recvbuf 0x7fefb3400000 recvoffset 4500000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 10 (0->8) sendbuf 0x7ff726c00000 sendoffset 5000000000 recvbuf 0x7fefb3400000 recvoffset 5000000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 11 (0->8) sendbuf 0x7ff726c00000 sendoffset 5500000000 recvbuf 0x7fefb3400000 recvoffset 5500000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 12 (0->8) sendbuf 0x7ff726c00000 sendoffset 6000000000 recvbuf 0x7fefb3400000 recvoffset 6000000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 13 (0->8) sendbuf 0x7ff726c00000 sendoffset 6500000000 recvbuf 0x7fefb3400000 recvoffset 6500000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 14 (0->8) sendbuf 0x7ff726c00000 sendoffset 7000000000 recvbuf 0x7fefb3400000 recvoffset 7000000000 count 500000000 (2.0000 GB) MPI
Bench 1 comm 15 (0->8) sendbuf 0x7ff726c00000 sendoffset 7500000000 recvbuf 0x7fefb3400000 recvoffset 7500000000 count 500000000 (2.0000 GB) MPI

CommBench 1: MPI communication matrix (reciever x sender) nnz: 16
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
16 . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . 
send footprint: 8000000000 32.0000 GB
recv footprint: 8000000000 32.0000 GB

5 warmup iterations (in order):
startup 1.23e+05 warmup: 1.45e+06
startup 1.52e+02 warmup: 1.33e+06
startup 1.32e+02 warmup: 1.33e+06
startup 1.32e+02 warmup: 1.33e+06
startup 1.30e+02 warmup: 1.33e+06
10 measurement iterations (sorted):
start: 1.2909e+02 time: 1.3293e+06 -> min
start: 1.3013e+02 time: 1.3293e+06
start: 1.3034e+02 time: 1.3293e+06
start: 1.3047e+02 time: 1.3293e+06
start: 1.3150e+02 time: 1.3294e+06
start: 1.3205e+02 time: 1.3294e+06 -> median
start: 1.3240e+02 time: 1.3294e+06
start: 1.3276e+02 time: 1.3294e+06
start: 1.3539e+02 time: 1.3295e+06
start: 1.3580e+02 time: 1.3295e+06 -> max

data: 32.0000 GB
minTime: 1.3293e+06 us, 4.1541e+01 ms/GB, 2.4073e+01 GB/s
medTime: 1.3294e+06 us, 4.1543e+01 ms/GB, 2.4071e+01 GB/s
maxTime: 1.3295e+06 us, 4.1546e+01 ms/GB, 2.4070e+01 GB/s
avgTime: 1.3294e+06 us, 4.1543e+01 ms/GB, 2.4071e+01 GB/s


wgropp commented on August 23, 2024

If Frontier doesn't support MPI_Isend_c, it doesn't have a valid MPI 4.0 library.

