gslib-deprecated's Introduction

A high-level view of the code in this directory is as follows. See each header
file listed for more documentation.

The following headers are fundamental to most of the code.

  name.h:    a given prefix is added to all external symbols;
             determines how FORTRAN routines are named
  types.h:   defines the integer types used everywhere (e.g., for array indices)
  mem.h:     memory-management wrappers;
             "array" type (generic dynamically sized array);
             "buffer" type ( = char array )
  comm.h:    wrappers for MPI calls (with alternative single proc versions)
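
To give a feel for what a generic dynamically sized array involves, here is a
minimal sketch in plain C. It is not the "array" type from "mem.h": the names
"dyn_array", "dyn_reserve", and "dyn_push" are hypothetical, and the real type
may be laid out and grown differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic growable array; the real "array" type in mem.h
   may differ. Elements are untyped, so every call passes the element size. */
struct dyn_array { void *ptr; size_t n, max; };

/* Ensure room for at least min elements of the given size.
   Returns 0 on success, -1 if allocation fails. */
int dyn_reserve(struct dyn_array *a, size_t min, size_t size)
{
  size_t max;
  void *p;
  if (min <= a->max) return 0;
  max = a->max + a->max / 2 + 1;   /* grow geometrically */
  if (max < min) max = min;
  p = realloc(a->ptr, max * size);
  if (!p) return -1;
  a->ptr = p;
  a->max = max;
  return 0;
}

/* Append one element of the given size, growing the storage if needed. */
int dyn_push(struct dyn_array *a, const void *elem, size_t size)
{
  assert(size > 0);
  if (dyn_reserve(a, a->n + 1, size) != 0) return -1;
  memcpy((char *)a->ptr + a->n * size, elem, size);
  a->n++;
  return 0;
}
```

A caller would start from a zero-initialized "struct dyn_array", push structs
of one fixed size, and release the storage with free(a.ptr) when done.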

The Gather/Scatter library top-level interface is defined in "gs.h".
The file "gs_defs.h" defines the datatypes and operations that it supports.
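
As an illustration of what a gather-scatter operation computes, here is a
single-process sketch for the "add" operation on doubles. It does not use the
"gs.h" interface: the function "gs_add_sketch" and its arguments are
hypothetical, and ids are assumed to lie in [0, nid). Entries of the local
vector that share a global id are summed, and the sum is written back to every
entry carrying that id.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of gs.h: for each global id, sum all
   entries of u that carry that id (gather), then write the sum back to
   every such entry (scatter). Ids are assumed to lie in [0, nid).
   Returns 0 on success, -1 if allocation fails. */
int gs_add_sketch(double *u, const int *id, size_t n, int nid)
{
  size_t i;
  double *sum;
  assert(nid > 0);
  sum = calloc((size_t)nid, sizeof *sum);   /* one accumulator per id */
  if (!sum) return -1;
  for (i = 0; i < n; ++i) {                 /* gather */
    assert(id[i] >= 0 && id[i] < nid);
    sum[id[i]] += u[i];
  }
  for (i = 0; i < n; ++i)                   /* scatter */
    u[i] = sum[id[i]];
  free(sum);
  return 0;
}
```

In the library itself the entries sharing an id may live on different
processors, which is where the communication wrappers come in; this sketch
covers only the local part of the idea.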

There are two coarse solvers (XXT and AMG), which are not currently very well
documented. The interface is given in "crs.h". 
 
"findpts" is documented in "findpts.c". The idea is that during a run of an
SEM code, we have a geometry map
  (processor, element, r, s, t) -> (x, y, z)
that defines our mesh. Within each element, the xyz coordinate is a
polynomial function of the parametric r,s,t coordinates.
"findpts" takes a distributed list of "(x,y,z)" points and computes the inverse
of the above map.
"findpts_eval" takes a list of "(proc,el,r,s,t)" coords, e.g., as returned by
  "findpts", and interpolates a given field at each point.
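
The inverse-map idea can be sketched in one dimension. The code below is a
simplified stand-in, not the algorithm in "findpts.c": it assumes a single
element whose coordinate x(r) is a polynomial given by monomial coefficients
(a hypothetical representation) and recovers r from x with Newton's method.
The names "poly_eval" and "invert_map_1d" are made up for this example.

```c
#include <assert.h>

/* Evaluate x(r) = c[0] + c[1] r + ... + c[n-1] r^(n-1) and its derivative
   dx/dr with Horner's rule. */
void poly_eval(const double *c, int n, double r, double *x, double *dx)
{
  double p = 0.0, d = 0.0;
  int k;
  for (k = n - 1; k >= 0; --k) {
    d = d * r + p;
    p = p * r + c[k];
  }
  *x = p;
  *dx = d;
}

/* Hypothetical 1D stand-in for the inverse map: find r in [-1,1] such
   that x(r) = xt. Returns 0 on convergence, 1 otherwise. */
int invert_map_1d(const double *c, int n, double xt, double *r_out)
{
  double r = 0.0, x, dx, err;
  int it;
  assert(n > 0);
  for (it = 0; it < 50; ++it) {
    poly_eval(c, n, r, &x, &dx);
    err = x - xt;
    if (err < 0.0) err = -err;
    if (err < 1e-12) { *r_out = r; return 0; }
    if (dx == 0.0) break;          /* Newton step undefined */
    r -= (x - xt) / dx;            /* Newton update */
    if (r < -1.0) r = -1.0;        /* stay inside the reference element */
    if (r > 1.0) r = 1.0;
  }
  return 1;
}
```

For example, with c = {1.0, 0.5, 0.1} the map is x(r) = 1 + 0.5 r + 0.1 r^2,
and calling invert_map_1d with xt = 1.159 recovers r close to 0.3.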


The "workhorses" of the implementations of much of the above are the
"sarray_sort" and "sarray_transfer" routines, documented in the respective
headers. The "array" type, defined in "mem.h", can be used to keep track of a
dynamically sized array of (arbitrary) structs.

  sarray_sort.h:     
    sort an array of structs (locally/sequentially) by one or two of its fields
  sarray_transfer.h:
    transfer each struct in array to the processor specified by a given field
    
These, in turn, are implemented using the lower-level routines of
"sort.h" and "crystal_router.h".
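
As a rough illustration of sorting an array of structs by one of its fields,
the sketch below uses the standard C "qsort" rather than the "sarray_sort"
interface, which is documented in its own header. The struct "point_rec", its
fields, and the function names are hypothetical. Sorting by the destination
processor makes records bound for the same processor contiguous, which is the
kind of grouping a transfer step can build on.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: a payload plus the processor it should be sent to. */
struct point_rec { double x; unsigned proc; };

/* Comparison callback for qsort: order records by their proc field. */
int cmp_proc(const void *a, const void *b)
{
  const struct point_rec *pa = a, *pb = b;
  return (pa->proc > pb->proc) - (pa->proc < pb->proc);
}

/* Sort locally by destination so that records bound for the same
   processor end up next to each other. qsort is not guaranteed to be
   stable, so the order within one processor's group is unspecified. */
void sort_by_proc(struct point_rec *rec, size_t n)
{
  assert(rec != NULL || n == 0);
  if (n > 1) qsort(rec, n, sizeof *rec, cmp_proc);
}
```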


The "findpts" algorithm makes use of a number of lower-level routines
that may be useful on their own.

  poly.h:     computation of quadrature nodes; fast polynomial interpolation
  lob_bnd.h:  (relatively) fast yet robust bounds for polynomials on [-1,1]^d
  obbox.h:    oriented as well as axis-aligned bounding boxes for spectral els
  tensor.h:   some tensor-product applications,
                with BLAS ops delegated to Nek, cblas, or a naive imp
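
For readers unfamiliar with polynomial interpolation on a set of nodes, here
is a naive sketch of the operation in one dimension. It is not the fast
method provided by "poly.h": the function "lagrange_eval" is hypothetical and
costs O(n^2) per evaluation point. It evaluates, at a parametric coordinate r,
the polynomial that passes through the values u[i] at distinct nodes z[i].

```c
#include <assert.h>

/* Naive O(n^2) Lagrange interpolation: evaluate at r the polynomial of
   degree n-1 that passes through the points (z[i], u[i]).
   The nodes z[i] must be distinct. */
double lagrange_eval(const double *z, const double *u, int n, double r)
{
  double sum = 0.0;
  int i, j;
  assert(n > 0);
  for (i = 0; i < n; ++i) {
    double w = 1.0;                /* i-th Lagrange basis function at r */
    for (j = 0; j < n; ++j)
      if (j != i) w *= (r - z[j]) / (z[i] - z[j]);
    sum += w * u[i];
  }
  return sum;
}
```

With nodes {-1, 0, 1} and values {1, 0, 1} (samples of r^2), evaluating at
r = 0.5 gives 0.25, since a quadratic is reproduced exactly by three nodes.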

The preprocessor macros that affect compilation are:
  name.h:  PREFIX="..."         prefix added to all C external symbols
           FPREFIX="..."        prefix added to all FORTRAN routines
           UPCASE, UNDERSCORE   determine the FORTRAN naming convention
  types.h: USE_LONG, USE_LONG_LONG, GLOBAL_LONG, GLOBAL_LONG_LONG
           determine the integer types used by all code
  mem.h: PRINT_MALLOCS=1   (print all memory management to stdout)
  comm.h: MPI  (use MPI when defined;
                otherwise, use a dummy single-proc implementation)
  tensor.h: USE_CBLAS, USE_NAIVE_BLAS
            (select BLAS implementation; default is Nek's mxm)
  fail.c: NO_NEK_EXITT    when defined, don't call Nek's exitt routine
  amg.c: AMG_BLOCK_ROWS   number of rows to read at a time (default=1200)
         GS_TIMING        record timings for the matrix multiplies
         GS_BARRIER       use a barrier to improve the quality of the timings
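
The way a prefix can be added to external symbols is easiest to see in a small
example. The sketch below shows the general token-pasting technique only; the
macro names "DEMO_PREFIX", "TOKEN_PASTE", and "PREFIXED_NAME" and the function
"vec_sum" are hypothetical, and the real definitions in "name.h" may differ.
The prefix is fixed in the source here rather than supplied by a compile-time
macro.

```c
#include <assert.h>

/* Hypothetical stand-ins for the mechanism described above; the real
   macros are defined in name.h and may differ in name and detail. */
#define DEMO_PREFIX demo_

#define TOKEN_PASTE_(a, b) a##b
#define TOKEN_PASTE(a, b) TOKEN_PASTE_(a, b)  /* expands arguments first */
#define PREFIXED_NAME(x) TOKEN_PASTE(DEMO_PREFIX, x)

/* Source code keeps using the short name "vec_sum"; the external symbol
   that the compiler emits is "demo_vec_sum". */
#define vec_sum PREFIXED_NAME(vec_sum)

int vec_sum(const int *v, int n)
{
  int i, s = 0;
  assert(n >= 0);
  for (i = 0; i < n; ++i) s += v[i];
  return s;
}
```

The two-level TOKEN_PASTE is needed so that DEMO_PREFIX is expanded to its
value before the "##" operator joins the tokens. Prefixing in this way lets
two copies of a library be linked into one program without symbol clashes.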



Differences from JL are listed below.
=====================
  Descriptions:
=====================

This directory (src/jl2) includes a newly tuned gather-scatter routine updated by
Matthew Otten (at Cornell). It follows an idea similar to the OpenACC gather-scatter
kernel previously developed by Aaron Vose (Cray Inc.) and Matthew Otten, described in:
       http://www.mcs.anl.gov/~mmin/hack_nekcem.pdf (p. 7; referred to here as version 15.0)

This newer version (src/jl2), referred to here as 15.1, has the following features:

      1. Keeps the same structure of James Lottes' original gs-routines;
      2. Uses new map arrays (for effective use of vectorization/streaming on GPU);
      3. Gives similar levels of performance on the CPU.

=====================
  How to use it:
=====================

To test our newly tuned gather-scatter kernel, nothing needs to change in the
source code. Simply change our makefile to compile with src/jl2 (instead of the
current src/jl).

For OpenACC GPU runs w/  GPUDirect: the "-DGPUDIRECT" option is required at compile time.
For OpenACC GPU runs w/o GPUDirect: no additional option required.
For CPU-only runs                 : no additional option required.

=====================
  Difference:
=====================
The only differences between src/jl and src/jl2 are in the following files:
     src/jl2/gs.c
     src/jl2/gs_local.c
     src/jl2/gs_local.h
     src/jl2/comm.c


