ls_affinity

List CPU affinity. The program supports four modes of operation: serial, multithreaded (using OpenMP), MPI, and hybrid OpenMP/MPI. It is mainly meant as a debugging aid for understanding the interaction between MPI runtimes and workload managers.

Prerequisites

You need the hwloc library. To compile the MPI version of the program, you need an MPI library with development headers (typically including an mpicc compiler wrapper). For the OpenMP version, you need a compiler with OpenMP support. To build the program using the provided makefile, GNU make is (probably) required.

Building

The out-of-the-box makefile assumes GCC. If you use the Intel compiler, use the Makefile.intel makefile, that is, run "make -f Makefile.intel". If you use another compiler, make a copy of the main makefile with a suitable suffix, such as "Makefile.foo", and edit it appropriately.

Running

Environment variables for OpenMP binding

As of OpenMP 3.1, the OMP_PROC_BIND environment variable can be set to "true" or "false". For more specific binding schemes, one must fall back on compiler-specific methods: for GCC, use GOMP_CPU_AFFINITY; for the Intel compiler, use KMP_AFFINITY.
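
As a rough illustration of the variables named above, the following sketch selects one binding mechanism per OpenMP runtime and prints the resulting settings. OPENMP_RUNTIME is a hypothetical helper variable invented for this example (ls_affinity does not read it), and the values "0-3" and "granularity=fine,compact" are only sample settings, not recommendations.

```shell
#!/bin/sh
# Sketch only: choose one binding mechanism depending on the OpenMP runtime.
# OPENMP_RUNTIME is a hypothetical variable for this example, not part of ls_affinity.
OPENMP_RUNTIME="${OPENMP_RUNTIME:-gcc}"

export OMP_NUM_THREADS=4

case "$OPENMP_RUNTIME" in
    gcc)
        # GCC (libgomp): bind threads, in order, to CPUs 0 through 3
        export GOMP_CPU_AFFINITY="0-3"
        ;;
    intel)
        # Intel runtime: place consecutive threads on neighbouring hardware threads
        export KMP_AFFINITY="granularity=fine,compact"
        ;;
    *)
        # Portable OpenMP 3.1 switch: bind threads, but let the runtime pick the CPUs
        export OMP_PROC_BIND=true
        ;;
esac

# Show the OpenMP-related settings that a subsequently launched program would inherit
env | grep -E '^(OMP_|GOMP_|KMP_)' | sort
```

With these variables exported, starting one of the OpenMP-enabled ls_affinity binaries from the same shell should show whether the runtime honoured the request.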

Example

Running 2 MPI processes with 4 threads per rank on a computer with 8 hardware threads gives, with the default settings of OpenMPI 1.4, output such as:

$ OMP_NUM_THREADS=4 mpirun -n 2 ./ls_affinity_mpi_openmp 
On host XXX, MPI rank 1 thread 0 bound to PU(s) 0-7
On host XXX, MPI rank 0 thread 0 bound to PU(s) 0-7
On host XXX, MPI rank 1 thread 1 bound to PU(s) 0-7
On host XXX, MPI rank 1 thread 3 bound to PU(s) 0-7
On host XXX, MPI rank 1 thread 2 bound to PU(s) 0-7
On host XXX, MPI rank 0 thread 1 bound to PU(s) 0-7
On host XXX, MPI rank 0 thread 3 bound to PU(s) 0-7
On host XXX, MPI rank 0 thread 2 bound to PU(s) 0-7

By setting the affinity for both OpenMP and OpenMPI, one gets:

$ GOMP_CPU_AFFINITY=0-7 OMP_NUM_THREADS=4 mpirun -n 2 -bind-to-core -cpus-per-proc 4 ./ls_affinity_mpi_openmp 
On host XXX, MPI rank 0 thread 0 bound to PU(s) 0
On host XXX, MPI rank 1 thread 0 bound to PU(s) 4
On host XXX, MPI rank 0 thread 1 bound to PU(s) 1
On host XXX, MPI rank 1 thread 1 bound to PU(s) 5
On host XXX, MPI rank 1 thread 2 bound to PU(s) 6
On host XXX, MPI rank 1 thread 3 bound to PU(s) 7
On host XXX, MPI rank 0 thread 2 bound to PU(s) 2
On host XXX, MPI rank 0 thread 3 bound to PU(s) 3

In the program output, "PU" stands for "processing unit", per the hwloc terminology: "The smallest processing element that can be represented by a hwloc object. It may be a single-core processor, a core of a multicore processor, or a single thread in a SMT processor. hwloc's PU acronym stands for Processing Unit."
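
To cross-check a reported binding independently of hwloc, the kernel's own view of the allowed CPUs can be inspected. The sketch below assumes Linux with /proc mounted; allowed_cpus is a hypothetical helper written for this example and is not part of ls_affinity. The kernel reports OS CPU numbers, which may or may not coincide with the PU numbers printed by the program, depending on which hwloc index it uses.

```shell
#!/bin/sh
# Assumption: Linux with /proc mounted. Not part of ls_affinity.
# Print the list of CPUs a process is allowed to run on, e.g. "0-7".
allowed_cpus() {
    # $1: PID to inspect. Defaults to "self", which resolves to the awk
    # child process; it inherits its affinity from the calling shell.
    awk '/^Cpus_allowed_list:/ { print $2 }' "/proc/${1:-self}/status"
}

# Affinity of the current shell (as inherited by its children)
allowed_cpus
```

Passing the PID of a running MPI rank as the argument shows the mask the kernel applies to that process.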
