ib2slurm 0.2

A program that attempts to generate a SLURM-style topology.conf(5) file using InfiniBand network discovery services.

This version is based on the original, which can be found at bringhurst/ib2slurm on GitHub.

Building

This version includes both a standard Unix Makefile and a CMakeLists.txt file for building the program with CMake.

Variables that control the CMake build include:

Variable             Type  Description
OFED_PREFIX          Path  Path at which OFED has been installed; defaults to /usr
USE_SLURM_HOSTLISTS  Bool  When enabled, use the hostlist functionality in libslurm to write in compact form; defaults to false
SLURM_PREFIX         Path  Path at which SLURM has been installed; no default

An example invocation might be:

$ mkdir build
$ cd build
$ cmake -DOFED_PREFIX=/usr \
>   -DUSE_SLURM_HOSTLISTS:BOOL=true \
>   -DSLURM_PREFIX=/home/1001/slurm/17.02.2 \
>   ..
-- The C compiler identification is GNU 4.4.7
-- Check for working C compiler: /usr/lib64/ccache/cc
-- Check for working C compiler: /usr/lib64/ccache/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/1001/ib2slurm/build
$ make
   :

Usage

  ib2slurm {options}

  [Fabric discovery]

    -C, --Ca <ca_name>           use the named CA
    -P, --Port <ca_port>         use the given port number on the CA
    -p, --progress               display progress information while fabric
                                 is discovered

  [Cached fabric]

    -l, --load-cache <path>      read a cached fabric definition from
                                 the file at the given path

  -o, --output <path>            write output topology configuration
                                 to the file at the given path
  -m, --node-name-map <path>     read CA-to-node-name map from the
                                 file at the given path
  -s, --lookup-names             map node GUIDs to names in the output
  -R, --deranged-lists           do not produce ranged name lists a'la SLURM
  -L, --linkspeed                include LinkSpeed values for switches
  -v, --verbose                  display additional information to stderr

  [version 0.2]
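
The -R/--deranged-lists option turns off ranged name lists. To illustrate the difference, here is a minimal Python sketch that collapses a list of node names into a bracketed range expression of the kind SLURM accepts. It is a simplified stand-in, not the libslurm hostlist code the program can be built against; the function name compact_names, the switch name sw0, and the sample node names are all hypothetical.

```python
import re


def compact_names(names):
    """Collapse names such as n000,n001,n002 into n[000-002] (simplified sketch)."""
    numbered = {}   # (prefix, digit width) -> set of numeric suffixes
    plain = []      # names with no trailing digits are passed through unchanged
    for name in names:
        m = re.fullmatch(r"(.*?)(\d+)", name)
        if m is None:
            plain.append(name)
            continue
        prefix, digits = m.groups()
        numbered.setdefault((prefix, len(digits)), set()).add(int(digits))

    out = []
    for (prefix, width), nums in sorted(numbered.items()):
        ordered = sorted(nums)
        # Split the sorted suffixes into runs of consecutive integers.
        runs = []
        start = prev = ordered[0]
        for n in ordered[1:]:
            if n != prev + 1:
                runs.append((start, prev))
                start = n
            prev = n
        runs.append((start, prev))

        if len(runs) == 1 and runs[0][0] == runs[0][1]:
            # A single name needs no brackets.
            out.append(f"{prefix}{runs[0][0]:0{width}d}")
        else:
            parts = [
                f"{a:0{width}d}" if a == b else f"{a:0{width}d}-{b:0{width}d}"
                for a, b in runs
            ]
            out.append(f"{prefix}[{','.join(parts)}]")
    return ",".join(out + sorted(plain))


nodes = ["n000", "n001", "n002", "n004"]
print("ranged:  ", "SwitchName=sw0 Nodes=" + compact_names(nodes))
print("deranged:", "SwitchName=sw0 Nodes=" + ",".join(nodes))
```

The ranged form here is n[000-002,004], while the deranged form simply lists every name separated by commas.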

The program requires an in-memory representation of the InfiniBand fabric. The ibnetdiscover command includes a --cache option that writes that in-memory representation to a file:

$ ibnetdiscover ... --cache ib-topology.cache
$ ls -l ib-topology.cache
-rw-r--r-- 1 user group 134907 May  8 14:12 ib-topology.cache

The --load-cache option to ib2slurm allows such a file to be reused:

$ ib2slurm --load-cache ib-topology.cache

Lacking that option, the program will use the InfiniBand fabric discovery library to create an in-memory representation of the network. The --Ca and --Port options can be used to restrict the enumeration to a specific InfiniBand adapter or port on the adapter.

The --node-name-map option takes the path to a node name map file formatted as described in the ibnetdiscover(8) man page. The node name map entries must be compatible with SLURM's node naming scheme. If no node name map is provided, the program will attempt to infer hostnames from the description field associated with host adapters by using the first word present. For example, the HCA description n000 HCA-1 would yield a node name of n000. Lacking any name mapping, the GUID of the node will be used (and may not be acceptable to SLURM as-is).
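The name fallback described above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the program's C code: the function name infer_node_name, the dictionary-based name map, and the sample GUID are hypothetical, and the exact precedence when a map is supplied but lacks an entry is an assumption.

```python
def infer_node_name(description, guid, name_map=None):
    """Choose a node name: name map entry, else first word of the description, else GUID."""
    # Assumption: an explicit name map entry wins when one exists for this GUID.
    if name_map and guid in name_map:
        return name_map[guid]
    # Otherwise use the first whitespace-separated word of the HCA description.
    words = description.split()
    if words:
        return words[0]
    # Last resort: the GUID itself, which SLURM may not accept as a node name.
    return guid


guid = "0x0002c90300000001"  # hypothetical GUID
print(infer_node_name("n000 HCA-1", guid))                  # n000
print(infer_node_name("", guid))                            # 0x0002c90300000001
print(infer_node_name("n000 HCA-1", guid, {guid: "r1n0"}))  # r1n0
```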

If the --linkspeed option is used, the program will look up the native link speed and width for a switch and use the product of those two integers as the relative speed of the switch. For example, a Mellanox FDR SX6025 has a link width of 56 and a speed of 10, which would produce LinkSpeed=560. Yes, this is 100% arbitrary and may not always work. But it's a start.
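As a worked version of that arithmetic, the following Python sketch multiplies the two values and attaches the result to a switch line. The function name relative_link_speed, the switch name sw0, and the node list are hypothetical; only the width-times-speed product and the 56 x 10 = 560 example come from the description above.

```python
def relative_link_speed(width, speed):
    """Relative switch speed as described above: link width times link speed."""
    return width * speed


# The SX6025 example from the text: width 56, speed 10.
value = relative_link_speed(56, 10)
print(f"SwitchName=sw0 Nodes=n[000-003] LinkSpeed={value}")
```

Because the value is just a product of two reported integers, it is useful only for comparing switches against each other, not as a bandwidth figure.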

Contributors

bringhurst, jtfrey, natecrawford