nekcem / nekcem


Spectral-element solver for Maxwell's equations, drift-diffusion equations, and more.

Home Page: https://nekcem.mcs.anl.gov

License: Other

Python 0.87% Shell 0.60% MATLAB 0.14% Makefile 0.27% Fortran 59.93% C 36.83% CMake 0.04% C++ 0.54% Cuda 0.79%

discontinuous-galerkin drift-diffusion electromagnetism hpc spectral-elements

nekcem's Introduction

NekCEM

NekCEM is a discontinuous Galerkin spectral-element solver for Maxwell's equations and the drift-diffusion equations, written in Fortran and C. It runs efficiently in parallel on a wide variety of systems, from laptops to the supercomputers at the Argonne Leadership Computing Facility (ALCF) and the Oak Ridge Leadership Computing Facility (OLCF), including Nvidia GPUs. Its core data structure is based on the computational fluid dynamics code Nek5000.

Installing

Dependencies

To run simulations with NekCEM you will need the following things.

An MPI implementation.
Python 2.7 or higher (including all versions of Python 3).
BLAS and LAPACK.

Some notes on the dependencies:

To keep things simple, make sure the compiler wrappers mpif77 (or mpifort) and mpicc are on your path. This isn't strictly necessary, but without them you will have to do more work when compiling simulations.
Python is only used in the build process.
The system version of Python on some ALCF and OLCF systems is 2.6; use softenv or modules to switch to a more recent version. Run soft add +python on a softenv system and module load python on a modules system.
Again, to keep things simple, make sure you can link to BLAS and LAPACK using -lblas and -llapack.
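
As a quick sanity check of the Python requirement, the following is a minimal sketch that compares an interpreter version against the 2.7 minimum from the dependency list above. The helper name python_is_supported is an illustration only and is not part of NekCEM.

```python
import sys

# Minimum version taken from the dependency list above.
MINIMUM = (2, 7)


def python_is_supported(version_info=sys.version_info):
    """Return True if the given interpreter version is at least 2.7.

    This helper is hypothetical; NekCEM does not ship it.
    """
    return tuple(version_info[:2]) >= MINIMUM


if __name__ == "__main__":
    if not python_is_supported():
        sys.exit("Python 2.7 or higher is needed for the build scripts")
```

Running it with the system interpreter on a machine where that interpreter is 2.6 should exit with the message, which is a hint to load a newer Python through softenv or modules.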

Standard install

To install NekCEM run

git clone https://github.com/NekCEM/NekCEM
cd NekCEM
sudo make install

The command make install does a couple of things.

It copies NekCEM/src and NekCEM/bin to /usr/local.
It symlinks some scripts to /usr/local/bin.

Note that installing to /usr/local is simply the default option; the install directory can be controlled in the standard way using the variables DESTDIR, prefix, and bindir.

Development install

If you want to help develop the NekCEM source code, first fork the NekCEM repo on GitHub. Then do

git clone https://github.com/<github-username>/NekCEM
cd NekCEM
git remote add upstream https://github.com/NekCEM/NekCEM
sudo make install_inplace

The command sudo make install_inplace only symlinks scripts to /usr/local/bin, allowing a developer to edit the source in their local clone while still having the necessary scripts on their path.

Running simulations with NekCEM

Setting up a simulation with NekCEM requires creating four files.

A user file. This is a Fortran file which contains various subroutines used to control the solvers. Its file extension should be usr.
A size file. This file contains compile-time parameters. It should be called SIZE.
A read file. This file contains parameters which are read at runtime. Its file extension should be rea.
A map file. This file contains the mapping between processors and elements. Its file extension should be map, and it must have the same stem as the read file.

A typical NekCEM simulation will be set up like this

example
├── readfile.map
├── readfile.rea
├── userfile.usr
└── SIZE
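
Before configuring, it can help to confirm that all four files are present and that the map file shares its stem with the read file. The following is a minimal sketch of such a check; the function name missing_case_files is hypothetical and not part of the NekCEM scripts.

```python
import os


def missing_case_files(case_dir, user_stem, read_stem):
    """Return the names of expected files that are absent from case_dir.

    Hypothetical helper: it only encodes the naming rules listed above.
    """
    expected = [
        user_stem + ".usr",  # user file with the Fortran subroutines
        "SIZE",              # compile-time parameters
        read_stem + ".rea",  # runtime parameters
        read_stem + ".map",  # processor/element map, same stem as the read file
    ]
    return [name for name in expected
            if not os.path.isfile(os.path.join(case_dir, name))]
```

For the layout above, missing_case_files("example", "userfile", "readfile") should return an empty list when the directory is complete.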

To build and run the code do the following from the example directory.

configurenek <solver> userfile
make
mpirun -np <number-of-processors> ./nekcem readfile &> log

Let's break down what's going on.

In the first step configurenek creates a makefile. The <solver> option determines which equations the application is targeting; it should be one of maxwell, drift, or shrod.
In the second step the makefile builds the code in the normal way; it produces an executable nekcem.
In the third step the code is run in the normal way for MPI applications.

The third step can be replaced with nek readfile <number-of-processors>. On a typical system this will do the exact same thing as mpirun, but on ALCF and OLCF machines it will also queue your job correctly.

Running the Tests

The tests can be run with bin/runtests [options]. For a complete list of options use the -h flag.

nekcem's People

Contributors

Stargazers

Watchers

Forkers

yinghe616 yhucd dengchangtao person142 ceed 0tt3r ping-hsuan abishekvenkit yslan eminsight misunmin abhikv brimacki lkampoli 5l1v3r1 yiminllin servomesh wieqli

nekcem's Issues

Porting to SYCL

Are you interested in having a [SYCL](https://www.intel.com/content/www/us/en/developer/tools/oneapi/training/dpc-essentials.html#gs.bnjiaf) port of NekCEM as a new backend?

With the SYCL backend, we'd like to extend the existing functionality of NekCEM by enabling the application to leverage multi-core accelerator devices from Nvidia, AMD, and Intel.

Why should `xmn`, `ymn`, `zmn` etc. exist?

They are just reshaped versions of xm1, ym1, zm1, and so on. They allow one to write xmn(k) instead of xm1(k,1,1,1), but is this worth all the extra variables floating around? At the very least they should be declared using equivalence statements so that they don't use a bunch of extra memory. Perhaps most tellingly these variables are not in Nek5000; why are they unnecessary there but necessary in NekCEM?

Clean up cem_param.F by separating EMWAVE, PML, and DRIFT

Both Maxwell and DRIFT call set_logic, so this subroutine shouldn't include EMWAVE and PML; otherwise every array in them will be declared within ifdef DRIFT.

pmlthick, pmlorder, and pmlreferr depend on PML.
If 2D (ldim=2), DRIFT doesn't need to check iftm or ifte. After fixing this, please change all 2D rea files in DRIFT (drift2dbcs).

tag @misunmin

Clean up `rdparam2`

Just so that we don't forget: currently rdparam sets some logical parameters here:

https://github.com/NekCEM/NekCEM/blob/development/src/nek5_connect2.F#L680

Most of these later get overwritten in set_logics:

https://github.com/NekCEM/NekCEM/blob/development/src/cem_param.F#L3

We should clean up rdparam so that it doesn't set these parameters.

Restructuring plan

Checklist

Test cases

Core

2dboxper: 2D box geometry with periodic BCs
2dboxpec: 2D box geometry with PEC BCs
2dboxpml: 2D box geometry with a PML
3dboxper: 3D box geometry with periodic BCs
3dboxpec: 3D box geometry with PEC BCs
3dboxpml: 3D box geometry with a PML
Tests for drift-diffusion/Schrodinger/Helmholtz.

Basic

dielectric: Flat dielectric interface.
waveguide: cylindrical/rectangular waveguide.
Tests for drift-diffusion/Schrodinger/Helmholtz.

Application

drude: Flat dielectric-Drude material interface
lorentz: Flat dielectric-Lorentz material interface
graphene: Flat sheet of graphene.
Tests for drift-diffusion/Schrodinger/Helmholtz.

Restructuring makenek

Follow Nek5000's structure.
Move to using makenek/makenek.inc/Makefile.template.

Code can hang if `lelx` is the wrong size

If lelx et al. are too small for the desired box mesh, you get this error

-6  -6 (number of elements in x,y)
 Abort, increase lelx,lely and recompile           6           6           5           5

but the code continues to run (for an unknown amount of time) instead of exiting.

io / restart

Checklist

check Nek script concerning non-restart/restart
check required additional fields depending on applications
indication of usual & restart I/O producing vtk in logfile
perhaps, add material properties (maxwell/drift) into vtk output in usual I/O
test example
perhaps we might produce restart output files by default (with far fewer of the usual vtk outputs).

Modify how we handle material properties

Currently material properties are handled by looping over every point and calling uservp. This happens here:

https://github.com/NekCEM/NekCEM/blob/development/src/subs1.F#L225

There are some shortcomings to this approach now that we've added 2d materials and the ability to add incident fields on faces only. In particular, using these models requires setting material properties/markers on faces only, which isn't handled in a natural way with the current setup.

To deal with this, we're planning to:

Just call uservp once inside subs1.F and let any looping be handled in the .usr file. That way the user can loop over every point and/or faces as necessary.
Rename uservp to usermark.

`drift3d` test doesn't run with `np = 2`

The array sizes are incorrect:

element load imbalance:            0         256         256
 done :: mapelpr, mapping elements to processors\n
           0  NEL too large:         256         256         128         128

call exitt: dying ...

           1  NEL too large:         256         256         128         128

In particular, this means the test isn't actually getting run on TravisCI.

This is why exitt should have a return code...
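
As long as a failed run can still end with a zero exit status, one stopgap is to scan the log for failure markers. The following is a minimal sketch of that idea; the marker list is an assumption copied from the log excerpts quoted in the issues on this page and is almost certainly incomplete, and find_failures is a hypothetical name rather than part of bin/runtests.

```python
import re

# Markers copied from log excerpts quoted in the issues on this page.
# This list is an assumption, not an exhaustive catalogue of NekCEM errors.
FAILURE_PATTERNS = [
    r"\bABORT\b",
    r"\bAbort,",
    r"NEL too large",
    r"dying \.\.\.",
    r"Segmentation fault",
]
_FAILURE_RE = re.compile("|".join(FAILURE_PATTERNS))


def find_failures(log_text):
    """Return (line_number, line) pairs for log lines matching a marker."""
    return [(number, line.rstrip())
            for number, line in enumerate(log_text.splitlines(), start=1)
            if _FAILURE_RE.search(line)]
```

A test driver could read the log file written by the run, call find_failures on its contents, and fail the test whenever the returned list is non-empty.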

Box mesh fails with PGI compiler

When running e.g. 2dboxpml on tesla with the PGI compiler one gets this error:

PGFIO-F-231/list-directed read/unit=99/error on data conversion.
 File name = box.tmp    formatted, sequential access   record = 1
 In source file /home/jwilson/Projects/NekCEM-dev/NekCEM/src/nek5_genbox.F, at line number 607
PGFIO-F-225/list-directed read/unit=99/lexical error-- unknown token type.
 File name = box.tmp    formatted, sequential access   record = 1
 In source file /home/jwilson/Projects/NekCEM-dev/NekCEM/src/nek5_genbox.F, at line number 625
--------------------------------------------------------------------------
mpiexec noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------

Tasks for after the hackathon

Do usersol on the CPU
Fix cpu_p_t
Check errors in tests by scanning logs
Make shn and sen local to .usr files

CFL condition doesn't work with graphene

The ADEs for the graphene model are not compatible with the typical Maxwell CFL condition. To see this, try running the graphene test at a lower order; e.g. 6. Options for handling this:

think of a workaround
have the code error out
put a warning somewhere

A generic workaround probably isn't possible since there is an unlimited number of current models, so the second option (having the code error out) might be the way to go.

Stale Branches

There are a few stale branches which haven't been updated in about 2 years:

Are any of these still needed? If so, would it be better for them to live in someone's fork? If anyone deals with one of them, check it off the list.

cc @misunmin @0tt3r @yinghe616.

Change `if3d` to be a parameter

See #135 (comment). Changing from 2D to 3D requires recompiling anyway, and this would allow the compiler to trim unnecessary 2D/3D branches at compile time.

Build failures with gcc 5 and 6

The build fails at the linking step, giving this error:

/usr/bin/ld: obj/io_rb.o: undefined reference to symbol 'pthread_create@@GLIBC_2.2.5'
//lib/x86_64-linux-gnu/libpthread.so.0: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [all] Error 1

This can be reproduced on e.g. churn when using +gcc-6.2.0 and +mpich-3.2-gcc-6.2.0 in softenv.

PML examples

The current 'tests/2dboxpml' and 'examples/pml2d' have a PML in only one direction.
We will need 2D/3D examples that are fully surrounded by a PML, with a point source (checking the amplitude).

gfortran 8++ fails to compile

Same issue in Nek5000/Nek5000#497

As Stefan mentioned, the two options to solve this are

change to assumed-size arrays, e.g. real a(*)
compile with -std=legacy

Currently, the simplest way is the second one, which works for gcc 8.2 and gcc 9.1.

Should we keep a local version of the Nek5000 tools?

Since Nek5000 is changing faster than NekCEM, the latest tools might be incompatible with the reader in NekCEM.

For example, NekCEM cannot read .map files with a header like "#v01".
Users might struggle to find the correct version of the tools in Nek5000.

I think we can wait until the new genmap is done in Nek5000. Just putting a reminder here.

At this point, NekCEM only supports rea/re2 + map (old?).

Hangs on Blues

When I run on Blues, the code hangs!

Need to recover these DRIFT subroutines

For now we don't have a proper test case, so the following subroutines have not been restructured:

drift_filter
cem_drift_project.F
BDF2
sem_drift_sem_bdf2
sem_drift_rhs_bdf2

Is there any chance to check the memory between compile and run?

I was running ParaView at the same time and didn't notice that it occupied 70% of memory, so NekCEM didn't have enough space to run.

Is there any chance to check the memory between compile and run?

In this case compiling is fine, but running produces the messages below, which make it hard to understand what happened:

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 18401 RUNNING AT sean-kubuntu-UX510UXK
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
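
One way to approach the question is a pre-run check of available memory. The following is a minimal sketch, assuming a Linux system that exposes a MemAvailable field in /proc/meminfo. The function names are hypothetical, nothing here is part of NekCEM, and the amount of memory a given case needs (required_kb) has to be estimated by the user.

```python
def available_memory_kb(meminfo_text):
    """Parse the MemAvailable value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None  # field absent, e.g. on older kernels


def enough_memory(required_kb, meminfo_path="/proc/meminfo"):
    """Return True or False, or None when availability cannot be determined.

    required_kb is a user-supplied estimate; this sketch does not compute it.
    """
    try:
        with open(meminfo_path) as handle:
            available = available_memory_kb(handle.read())
    except (IOError, OSError):
        return None  # no such file, e.g. on a non-Linux system
    if available is None:
        return None
    return available >= required_kb
```

A wrapper script could call enough_memory before launching mpirun and print a clear warning instead of letting the run end in a segmentation fault.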

PML still unstable in TM case

To reproduce, take the 2dboxpml test and change it from a TE mode to a TM mode (just in the .rea file; the .usr file will automatically adjust). Then watch the numerical solution explode.

Clean up `parameter.F`

The file:

https://github.com/NekCEM/NekCEM/blob/development/src/parameter.F

has acquired a fair bit of cruft. In particular, there are many parameters defined which we don't use, and there are also many parameters defined which we do use but would be better handled in the .usr file (e.g. ifdrude, iflorenz, and ifgraph).

Also, the hxactive stuff looks unnecessary; it would probably be better to just check for TM/TE cases. Note that it is essentially replicated in cem_opt.F as well as in cem_dg.F; maybe this is a copy-paste job that never got cleaned up?

[NEWTON] Global variables conflict between gmres in Newton and gmres in bdf1

We use gmres to solve the Newton iteration. When evaluating JacVec, there is a nested gmres in bdf1 (if using multigrid gmres). However, both include the same variables in "GMRES".

This might be the reason why mg doesn't work in NEWTON, and also why NEWTON doesn't work in STERIC (MGMRES).

Restart doesn't support multiple species

I am not sure if this part intersects with Maxwell,
but we really need a test case for restart...

Using drift, ldimt=3
nsteps = 40000
iocomm = 100
restart=10

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x7FEBB83BE777
#1 0x7FEBB83BED7E
#2 0x7FEBB7D0CCAF
#3 0x7FEBB7D5EDFA
#4 0x514162 in readheader4_ at io_co_read.c:108
#5 0x50CCCF in restart_swap_ at io.F:733
#6 0x40BE12 in cem_drift_init_ at cem_drift.F:114
#7 0x4C9AB1 in cem_solve_ at cem_drive.F:128
#8 0x4036C1 in nekcem at cem_drive.F:11
#9 0x7FEBB7CF7F44

Get_fast_BC_Error

I am getting this error when I make a custom mesh for a 2D pipe and convert it with exo2nek. Not sure if this is a compatibility issue between nek and an externally made mesh?

call usrsetvert
done :: usrsetvert

gs_setup: 0 unique labels shared
handle bytes (avg, min, max): 8.26515e+06 8265148 8265148
buffer bytes (avg, min, max): 0 0 0
setupds time 3.3115E-01 seconds 4 8 346172 2548
0 4 1692 1 E get_fast_bc_error
-1 2 1692 1 EXO get_fast_bc_error
-1 1 1692 1 EXO get_fast_bc_error
0 3 1692 1 E get_fast_bc_error
0 5 1692 1 E get_fast_bc_error
-1 6 1692 1 EXO get_fast_bc_error
EXIT: Error A get_fast_bc 1692

an error occured: dying ...

`2dboxpec` has large errors compared to `2dboxper`

The errors for the 2dboxper and 2dboxpec tests are about:

              2dboxper    2dboxpec
L^2 error     5e-8        1e-6
L^oo error    1e-6        1e-5

The tests use the same order and run for the same number of steps with the same dt. Is the larger error for 2dboxpec expected?

LDFLAG is not handled properly by configurenek

If I set the following in makenek file,

LDFLAGS="-L/home/thilina/Repos/CEED/spack/opt/spack/linux-ubuntu17.10-x86_64/gcc-6.4.0/openblas-0.2.20-tzh6zl6jyufvohifypayse5ujxovvmx7/lib -lopenblas"

in the generated Makefile I get,

LDFLAGS = - L / h o m e / t h i l i n a / R e p o s / C E E D / s p a c k / o p t / s p a c k / l i n u x - u b u n t u 1 7 . 1 0 - x 8 6 _ 6 4 / g c c - 6 . 4 . 0 / o p e n b l a s - 0 . 2 . 2 0 - t z h 6 z l 6 j y u f v o h i f y p a y s e 5 u j x o v v m x 7 / l i b   - l o p e n b l a s

I think the culprit is the line:

LDFLAGS = ' '.join(LDFLAGS)

in the configurenek script. For the time being, as a workaround, I am adding my LD flags to EXTRALDFLAGS, which is not ideal.
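
The symptom is consistent with how str.join treats a bare string: a string is iterable character by character, so joining it inserts a space between every character. The following is a minimal sketch of the behavior and of one possible guard. The helper join_flags and the path /opt/lib are illustrations only; this is not the actual configurenek code.

```python
def join_flags(flags):
    """Join linker flags with spaces, accepting either a string or a list.

    Hypothetical helper, not the actual configurenek implementation.
    """
    if isinstance(flags, str):
        # A bare string would be iterated character by character,
        # so wrap it in a list before joining.
        flags = [flags]
    return " ".join(flags)


# Joining a bare string reproduces the spaced-out output from the issue.
print(" ".join("-L/opt/lib -lopenblas"))

# The guarded helper leaves a string intact and still joins lists.
print(join_flags("-L/opt/lib -lopenblas"))
print(join_flags(["-L/opt/lib", "-lopenblas"]))
```

Whether the right fix is a guard like this or always passing LDFLAGS as a list depends on how configurenek reads its inputs, which this page does not show.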

Can you please look into this @misunmin @person142 ?

[DRIFT] multigrid with FDM doesn't support Robin BC

This happens when running on a box mesh (no deformed geometry).
I get "get_fast_bc_error" in the logfile.

After digging in, I found there is no 'R' bc in the subroutine "get_fast_bc".
So we should add this at cem_drift_mg.F#L3970:
if (cbc(ied,e,ifield).eq.'R ') ibc = 2 ! Robin

I will make a PR for this after convergence tests.

When the error "nelt>lelt" happens, it won't automatically exit

I ran a DRIFT case on 24 cores and forgot to calculate nelt and change it in SIZE.

Although the output prints "ABORT", the code keeps running and occupying the cores.
It should call exitt automatically.

gfdm/ifgtp/ifre2 T F F
start reading mesh
ABORT: nelt>lelt! 20 50 42
ABORT: nelt>lelt! 21 50 42
ABORT: nelt>lelt! 22 50 42
ABORT: nelt>lelt! 23 50 42
rdbdry2: ifheat/ifmhd/ifflow= T F T

Remove `save` statements

The end goal being to make the code reentrant, which is convenient when we want to couple with other codes.

Get `newton-dirichlet` test running on Travis

As mentioned in #166 (comment), we need to change the name of the test to be of the form drift-* so that Travis picks it up.
