
MP-sort

DOI: https://zenodo.org/badge/latestdoi/22829423

A Massively Parallel Sorting Library. The library implements a histogram sort. The scaling of MP-sort up to 160,000 MPI ranks is studied in [1]. MP-sort is the sorting module of the BlueTides simulation [2].
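As a rough illustration of the general histogram-sort idea, the sketch below simulates ranks as plain Python lists in a single process. It is a toy under stated assumptions, not MP-sort's actual implementation: the function names and the `key_bits` parameter are hypothetical, keys are assumed to be non-negative integers below `2**key_bits`, and the per-splitter count stands in for what would be a global reduction across MPI ranks.

```python
from bisect import bisect_left, bisect_right


def find_splitters(sorted_chunks, nparts, key_bits=16):
    """Bisect on the key range until each splitter reaches its target global rank."""
    total = sum(len(chunk) for chunk in sorted_chunks)
    splitters = []
    for part in range(1, nparts):
        target = part * total // nparts
        lo, hi = 0, 1 << key_bits
        # Find the smallest key s such that at least `target` keys are below s.
        while lo < hi:
            mid = (lo + hi) // 2
            # Summing per-chunk counts plays the role of a global reduction.
            below = sum(bisect_left(chunk, mid) for chunk in sorted_chunks)
            if below >= target:
                hi = mid
            else:
                lo = mid + 1
        splitters.append(lo)
    return splitters


def histogram_sort(chunks, key_bits=16):
    """Return one sorted list per simulated rank; concatenated, they are globally sorted."""
    nparts = len(chunks)
    sorted_chunks = [sorted(chunk) for chunk in chunks]
    splitters = find_splitters(sorted_chunks, nparts, key_bits)
    buckets = [[] for _ in range(nparts)]
    for chunk in sorted_chunks:
        for key in chunk:
            # Destination is the number of splitters less than or equal to the key.
            buckets[bisect_right(splitters, key)].append(key)
    return [sorted(bucket) for bucket in buckets]
```

With distinct keys the buckets come out evenly sized; this toy makes no attempt to rebalance when many keys are equal.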

MP-sort is useful for big-data simulation and analysis on traditional HPC platforms that use MPI.

We provide

  • a C interface that can be easily integrated into data producers, such as large scale simulation applications; and
  • a Python interface that can be easily integrated into data consumers, such as parallel Python scripts that query simulation data.

A real-world Python example is https://github.com/bccp/nbodykit/blob/master/ichalo.py , where mpsort is used twice to match up properties of particles across two different snapshots, to establish the motion of dark matter halos across cosmic time.

Install

Use the Makefile to build / install the .a targets, and link against

-lradixsort -lmpsort-mpi

The Makefile supports overriding CC, MPICC, and CFLAGS.

The header to include is

#include <mpsort.h>

The Python binding can be installed with

python setup.py install [--user]

The same overrides (CC, MPICC, and CFLAGS) are supported when set as environment variables.

Usage: C

The basic C interface is:

void mpsort_mpi(void * base, size_t nmemb, size_t size,
    void (*radix)(const void * ptr, void * radix, void * arg),
    size_t rsize,
    void * arg, MPI_Comm comm);

/*
Parameters
----------
base :
    base pointer of the local data
nmemb :
    number of items in the local data
size :
    size of an item in the local data, in bytes
radix (ptr, radix, arg):
    the function calculates the radix of the item at ptr;
    the radix shall be stored at the memory location pointed to by radix
rsize :
    the size of radix array in bytes; only the behavior for 8 is well-defined:
    the radix is interpreted as uint64.
arg   :
    argument to pass into radix()
comm  :
    the MPI communicator for the sort.

*/
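Because the 8-byte radix is interpreted as uint64, a signed key needs an order-preserving mapping before it is stored. The arithmetic is sketched in Python below; in a C callback the same value would be written to the location pointed to by radix. The helper names are hypothetical, and flipping the sign bit is a standard technique assumed here rather than something this README prescribes.

```python
import struct

SIGN_BIT = 1 << 63
MASK64 = (1 << 64) - 1


def int64_to_radix(value):
    """Map a signed 64-bit integer to an unsigned 64-bit value with the same ordering."""
    if not -SIGN_BIT <= value < SIGN_BIT:
        raise ValueError("value does not fit in a signed 64-bit integer")
    # Reinterpret as two's complement, then flip the sign bit so that
    # negative numbers sort before non-negative ones as unsigned values.
    return (value & MASK64) ^ SIGN_BIT


def radix_bytes(value):
    """Pack the radix as 8 bytes in native byte order, matching rsize == 8."""
    return struct.pack("=Q", int64_to_radix(value))
```

Sorting by `int64_to_radix` gives the same order as sorting the signed values directly, which is the property the callback has to preserve.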

Usage: Python

The basic Python interface is:

import mpsort

mpsort.sort(localdata, orderby=None, comm=None, tuning=[])

"""
Sort a distributed array in place.

Parameters
----------
localdata : array_like
    local data; must be C_CONTIGUOUS and of a struct dtype,
    for example, localdata = numpy.empty(10, dtype=[('key', 'i4'), ('value', 'f4')]).
orderby : scalar
    the field to sort by. The field must be of an integral type: 'i4', 'i8', 'u4', or 'u8'.

"""
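A minimal usage sketch follows, assuming numpy, mpi4py, and the mpsort binding are installed and that comm accepts an mpi4py communicator. The function name `sort_by_key` and its parameters are hypothetical; imports are kept inside the function so the file can be loaded on a machine without MPI.

```python
def sort_by_key(n_local=10, seed=0):
    """Fill a struct array with random keys on each rank and sort it across ranks."""
    # Assumption: numpy, mpi4py and mpsort are available at call time.
    import numpy
    from mpi4py import MPI
    import mpsort

    comm = MPI.COMM_WORLD
    # Seed differently on each rank so the ranks hold different keys.
    rng = numpy.random.RandomState(seed + comm.rank)

    # A C-contiguous array of a struct dtype with an integral key field.
    localdata = numpy.empty(n_local, dtype=[('key', 'i8'), ('value', 'f4')])
    localdata['key'] = rng.randint(0, 1000, size=n_local)
    localdata['value'] = rng.uniform(size=n_local)

    # Sorts in place by the 'key' field across all ranks of comm.
    mpsort.sort(localdata, orderby='key', comm=comm)
    return localdata
```

The function would be called from a script started with an MPI launcher, for example `mpirun -n 4 python script.py`, where the script name is a placeholder.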

Tuning

For runs with a very large number of ranks, the sort may slow down because the backend selects a conservative MPI_Allreduce implementation. For Cray MPI, setting the following environment variable helps:

export MPICH_ALLREDUCE_BLK_SIZE=$((4096*1024*2))

There are also flags controlling the algorithm used by MPI_Alltoallv.

If the communication is very sparse (e.g. mostly sorted data), a sparse algorithm based on non-blocking send / recv may provide better performance; some MPI implementations do not switch algorithms automatically. We can allow mpsort to select a sparse algorithm automatically by setting the environment variable MPSORT_ENABLE_SPARSE_ALLTOALLV, by calling mpsort_mpi_set_option(MPSORT_ENABLE_SPARSE_ALLTOALLV), or by passing 'ENABLE_SPARSE_ALLTOALLV' to the tuning argument of the Python interface.
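For the Python route, the option can be passed per call. The wrapper below is a small sketch; the name `sort_mostly_sorted` is hypothetical, and it assumes the mpsort binding is installed.

```python
def sort_mostly_sorted(localdata, orderby, comm=None):
    """Sort with mpsort, allowing it to pick the sparse MPI_Alltoallv algorithm."""
    # Imported lazily; requires the mpsort Python binding at call time.
    import mpsort

    mpsort.sort(localdata, orderby=orderby, comm=comm,
                tuning=['ENABLE_SPARSE_ALLTOALLV'])
    return localdata
```

Keeping the flag in one wrapper makes it easy to compare timings with and without the sparse algorithm on the same data.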

[1] Feng, Y., Straka, M., Di Matteo, T., Croft, R., MP-Sort: Sorting for a Cosmological Simulation on BlueWaters, Cray User Group, 2015.
[2] Feng et al., BlueTides: First galaxies and reionization, Monthly Notices of the Royal Astronomical Society, 2015, submitted.
