I open this issue to discuss the changes to SPEC in order to write all output quantiti


All output quantities into one HDF5 file,about princetonuniversity/spec

Comments (35)

zhucaoxiang commented on September 23, 2024

@jonathanschilling
Let's discuss this issue here.
When using gfortran, I can successfully compile the new version. But when I execute the code, I get this run-time error.

HDF5-DIAG: Error detected in HDF5 (1.10.1) HDF5-DIAG: Error detected in HDF5 (1.10.1) MPI-process 1:
HDF5-DIAG: Error detected in HDF5 (1.10.1) MPI-process 2:
  #000: H5G.c line 323 in H5Gcreate2(): unable to create group
HDF5-DIAG: Error detected in HDF5 (1.10.1) MPI-process 3:
  #000: H5G.c line 323 in H5Gcreate2(): unable to create group
    major: Symbol table
    minor: Unable to initialize object
  #001: H5Gint.c line 161 in H5G__create_named(): unable to create and link to group
    major: Symbol table
    minor: Unable to initialize object
  #002: H5L.c line 1695 in H5L_link_object(): unable to create new link to object
HDF5-DIAG: Error detected in HDF5 (1.10.1) MPI-process 4:
  #000: H5G.c line 323 in H5Gcreate2(): unable to create group
    major: Symbol table
    minor: Unable to initialize object
  #001: H5Gint.c line 161 in H5G__create_named(): unable to create and link to group
    major: Symbol table
    minor: Unable to initialize object
  #002: H5L.c line 1695 in H5L_link_object(): unable to create new link to object
    major: Links
    minor: Unable to initialize object
  #003: H5L.c line 1939 in H5L_create_real(): can't insert link
    major: Symbol table
    minor: Unable to insert object
  #004: H5Gtraverse.c line 867 in H5G_traverse(): internal path traversal failed
    major: Symbol table
    minor: Object not found

I am using hdf5-parallel/1.10.1 together with GCC/8.1.0 and openmpi/3.0.0.

Besides, I still need to figure out why the following compile error occurs.

m4 -P macros newton.f90 > newton_m.F90
mpif90  -r8 -mcmodel=large -O3 -m64 -unroll0 -fno-alias -ip -traceback -o newton_r.o -c newton_m.F90 -I/usr/pppl/intel/2017-pkgs/openmpi-1.10.3-pkgs/fftw-3.3.7/include
newton_m.F90(83): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [HDF5]
  use sphdf5, only: write_convergence_output
------^
newton_m.F90(83): error #6580: Name in only-list does not exist.   [WRITE_CONVERGENCE_OUTPUT]
  use sphdf5, only: write_convergence_output
--------------------^

from spec.

jonathanschilling commented on September 23, 2024

@zhucaoxiang
Could you please include the input ext.sp file you used so that I can try to reproduce the error you got?
The error means that some call to the HDF5 API got a wrong parameter.

Concerning the compile error, which CC part of the Makefile are you using?


jonathanschilling commented on September 23, 2024

@zhucaoxiang
I think the compile error is due to a missing include instruction for the intel compiler.
Since sphdf5 needs the HDF5 library, you also need to include $(HDF5compile) in the makefile line for all the other files.
I pushed the corresponding change in the Makefile; could you please pull it and see if this solves your issue?


zhucaoxiang commented on September 23, 2024

@jonathanschilling
The test input file I am using is here.

With the Intel compiler, I use the default flags. With GCC, I use the gfortran ones, but I revised the Makefile a little bit (the commit has been pushed).

I think the compile error is due to a missing include instruction for the intel compiler.

You are correct. I can now successfully compile SPEC with the Intel compiler. However, the same runtime error appears. I will test another case.


zhucaoxiang commented on September 23, 2024

(Using Intel) The other example (G3V02L1Fi.sp) also exits abnormally. The error points to the line call h5dset_extent_f(iteration_dset_id, data_dims, hdfier) in sphdf5.f90.

xspec              0000000000437981  sphdf5_mp_write_c        5295  sphdf5_m.F90
xspec              00000000004FF97D  fcn2_                    1284  newton_m.F90
xspec              00000000005395F4  hybrj_                    754  minpack.f
xspec              00000000004F7CFE  newton_                   419  newton_m.F90
xspec              00000000005A3A0B  MAIN__                    507  xspech_m.F90


jonathanschilling commented on September 23, 2024

@zhucaoxiang
Okay, so far I tested the new output routines only with a single input file (G3V02L1Fi.001.sp from InputFiles/TestCases with much reduced tolerances) to quickly check if something obvious is going wrong. I will have a more detailed look using the input file you pointed out.

I am glad that the compiler errors are gone now :-)


zhucaoxiang commented on September 23, 2024

@jonathanschilling
That could be the reason. Thanks for pointing out the solution to the compiling error.


jonathanschilling commented on September 23, 2024

@zhucaoxiang
Okay, I found the error occurring in your run of solovev_fb_vmec7vol_final.sp.
During the free-boundary iterations, the vector potential is written to the output file in each free-boundary iteration. Therefore, after the first iteration, the already-existing dataset has to be re-opened instead of trying to create it. I adjusted the code and pushed the changes. Please check whether that case works now.
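
The re-open-instead-of-create fix described above follows a common pattern that can be sketched in a few lines. This is only an illustration, not the code in sphdf5.f90: `FakeGroup`, `write_iteration` and the dataset name `vector_potential` are hypothetical, and overwriting the values in place on later iterations is an assumption. h5py groups offer the same `name in group`, `group[name]` and `create_dataset(name, data=...)` idioms, so the helper should carry over to a real file.

```python
class FakeGroup:
    """Tiny in-memory stand-in for an HDF5 group (hypothetical, for illustration)."""

    def __init__(self):
        self._datasets = {}

    def __contains__(self, name):
        return name in self._datasets

    def __getitem__(self, name):
        return self._datasets[name]

    def create_dataset(self, name, data):
        # Like HDF5, refuse to create an object whose name already exists.
        if name in self._datasets:
            raise ValueError(f"unable to create dataset (name already exists): {name}")
        self._datasets[name] = list(data)
        return self._datasets[name]


def write_iteration(group, name, values):
    """Create the dataset on the first call; re-open and overwrite it afterwards."""
    if name in group:
        dset = group[name]      # re-open the existing dataset
        dset[:] = values        # overwrite its contents in place
    else:
        dset = group.create_dataset(name, data=values)
    return dset


group = FakeGroup()
for iteration in range(3):      # stands in for the free-boundary iterations
    write_iteration(group, "vector_potential", [float(iteration)] * 4)

final_values = group["vector_potential"]
```

Calling `create_dataset` unconditionally in the loop would fail on the second iteration, which mirrors the "unable to create" failure mode reported earlier.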

(Using Intel) The other example (G3V02L1Fi.sp) also exits abnormally.

I will investigate this next.

Thanks a lot so far for these helpful bug reports!


zhucaoxiang commented on September 23, 2024

@jonathanschilling
The solovev case is now working, except for a warning:

ending :            : 
There are still  50 HDF5 objects open!

You probably need to close the HDF5 objects manually.

The G3V02L1Fi case still has the run-time error as shown above.

Thanks for your debugging. That's the fascination of open-source collaboration.


jonathanschilling commented on September 23, 2024

@zhucaoxiang
I could reproduce the warning from the solovev case and found it to be a false positive: I asked the HDF5 library for any open objects, and it seems to also count internal objects. Limiting the check to the file actually opened by SPEC solves this.
I committed the relevant changes.

Up to now I was not able to reproduce the runtime error in the G3V02L1Fi.001.sp case.
Could you please re-check with the current status of SPEC?

That's the fascination of open-source collaboration.

I strongly agree :)


jloizu commented on September 23, 2024

I have pulled the issu68 branch and compiled and executed tests successfully. However, in trying to reproduce the verification proposed in Utilities/SPEC_HDF5_output_verification.txt I do get into some trouble:

When calling, e.g., fdata = read_spec_field('G3V01L0Fi.002.sp.h5') I get

- File "/.G3V01L0Fi.002.sp.A" does not exist

which means that I cannot read from the directory where the file is; it only works if I go outside it and then call, e.g., fdata = read_spec_field('somefoldername/G3V01L0Fi.002.sp.h5'). It would be good to allow for the former as well.
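
One way to make such a path lookup work both with and without a folder prefix is sketched below in Python. It is not the actual read_spec_field routine: the function name `sibling_hidden_file` is hypothetical, and the naming rule ("." + stem + ".A") is inferred only from the error message quoted above.

```python
import os


def sibling_hidden_file(h5_path, suffix=".A"):
    """Return the path of the hidden companion file next to `h5_path`.

    The naming rule ("." + stem + suffix) is an assumption inferred from
    the error message, not a documented SPEC convention.
    """
    directory, filename = os.path.split(h5_path)   # directory is "" for a bare name
    stem, _ext = os.path.splitext(filename)        # strips the trailing ".h5"
    # os.path.join skips the empty directory, so no stray leading "/" appears.
    return os.path.join(directory, "." + stem + suffix)


in_cwd = sibling_hidden_file("G3V01L0Fi.002.sp.h5")
in_subdir = sibling_hidden_file("somefoldername/G3V01L0Fi.002.sp.h5")
```

Building the path with `os.path.join` rather than by concatenating a separator by hand is what avoids the spurious leading "/" when no parent folder is given.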

When calling specheck(...) after loading fdata,pdata,idata,gdata, and data, I get

...
ERROR: gdata.Rij        
ERROR: gdata.Zij        
ERROR: gdata.sg         
ERROR: gdata.BR         
ERROR: gdata.Bp         
ERROR: gdata.BZ    
...
Not maching :(

and I checked that the maximum difference between, e.g., gdata.Rij and data.grid.Rij is of order 2e-15, so I am not sure what the reason is, but it is probably a minor thing related to precision?


jonathanschilling commented on September 23, 2024

Thanks for the feedback.

1. This should be fixed now. The error was in the path handling, which indeed (erroneously) assumed a parent folder.

2. This is due to compiler optimizations, which slightly change the order of operations performed on the data. Therefore, different rounding errors (on the order of machine precision, as you noticed) occur and are identified by the checking routine as mismatched. Please change the -O3 flag (or the corresponding compiler optimization flag) into -O0 to disable optimizations and re-run the code. This slows down execution quite a bit, but since we only need to do this once for the output comparison, I think it is ok?
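
The effect of reordering floating-point operations can be reproduced in a few lines of Python. This is a generic illustration, not SPEC or specheck code, and the tolerance-based comparison at the end is only a possible alternative to an exact match, not something adopted in this thread; the `rel_tol` value is an arbitrary assumption.

```python
import math

a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c     # one evaluation order
right_to_left = a + (b + c)     # mathematically identical, rounded differently

bitwise_equal = left_to_right == right_to_left
difference = abs(left_to_right - right_to_left)

# A tolerance-based check accepts differences at the level of machine precision.
close_enough = math.isclose(left_to_right, right_to_left, rel_tol=1e-12)

print(bitwise_equal, difference, close_enough)
```

The two sums differ in the last bit, so an exact comparison reports a mismatch while the tolerance-based one does not.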


jloizu commented on September 23, 2024

Thanks for the feedback.

1. This should be fixed now. The error was in the path handling, which indeed (erroneously) assumed a parent folder.

2. This is due to compiler optimizations, which slightly change the order of operations performed on the data. Therefore, different rounding errors (on the order of machine precision, as you noticed) occur and are identified by the checking routine as mismatched. Please change the -O3 flag (or the corresponding compiler optimization flag) into -O0 to disable optimizations and re-run the code. This slows down execution quite a bit, but since we only need to do this once for the output comparison, I think it is ok?

OK so issue 1 is resolved now, I checked.
For issue 2, I compiled without -O2 option but I still get NO MATCHING for gdata arrays (as before). Could it be due to the other compilation options? I have these:

CFLAGS=-r8
RFLAGS=-O2 -ip -no-prec-div -xHost -fPIC


jonathanschilling commented on September 23, 2024

OK so issue 1 is resolved now, I checked.

That is nice to hear.

compiled without -O2 option but I still get NO MATCHING

According to the Intel documentation, -O2 is the default and you have to explicitly set -O0 to disable optimizations.


jloizu commented on September 23, 2024

Oops :) you are right

OK with the -O0 option it now gives the same! Yeah!


jonathanschilling commented on September 23, 2024

Hey, that's cool! The pull request is approaching... ;-)

I currently still have the following items on my to-do list:

- put LaTeX comments and HDF5 attributes consistently on all output variables (currently only done for /input/physics)
- write/read Hessian (.GF.*, hesian)
- write/read derivative matrix (.sp.DF, newton)
- read plasma normal field on boundary (.Vn, read in xspech)
- verify also for slab and cylinder geometry using InputFiles/TestCases/G{1,2}*


zhucaoxiang commented on September 23, 2024

@jonathanschilling Great job!
I can also contribute some of my python scripts. I will push my commits in the branch and let you know.


zhucaoxiang commented on September 23, 2024

@jonathanschilling I just pushed one commit to fix bugs with debug version in sphdf5.f90.

I found a weird bug with mirror_input in sphdf5.f90. I pushed a test input file in Hotfix_hdf5 branch at test_hdf5.sp.

I was able to run this input once and got some "divided by zero" warnings. Here is the log output:

xspech :            : 
       :  compiled  : date    = Thu Jul 25 11:48:32 EDT 2019 ; 
       :            : srcdir  = /p/focus/share/spec ; 
       :            : macros  = macros ; 
       :            : fc      = mpifort ; 
       :            : flags   =  -fdefault-real-8 -O2 -ffixed-line-length-none -ffree-line-length-none -fexternal-blas ; 
xspech :            : 
xspech :       0.00 : begin execution ; ncpu=  8 ; calling global:readin ;
readin :            : 
readin :       0.00 : date=2019/07/25 , 14:00:21 ; machine precision= 1.11E-16 ; vsmall= 1.11E-14 ; small= 1.11E-12 ;
readin :            : 
readin :       0.00 : ext = test_hdf5                                                                                           
readin :            : 
readin :            : 
readin :       0.00 : Igeometry=  3 ; Istellsym=  1 ;
readin :            : Lfreebound=  1 ; phiedge=  1.000000000000000E+00 ; curtor= -1.190644396046887E-01 ; curpol=  4.881440045660601E+00 ;
readin :            : gamma=  0.000000000000000E+00 ;
readin :            : Nfp=  1 ; Nvol=  7 ; Mvol=  8 ; Mpol= 16 ; Ntor=  0 ;
readin :            : pscale=  1.55477E-07 ; Ladiabatic= 0 ; Lconstraint=  1 ; mupf: tol,its=  1.00E-12 , 128 ;
readin :            : Lrad =  8, 8, 8, 8, 8, 8, 8,10,
readin :            : 
readin :       0.00 : Linitialize=  1 ; Lzerovac= 0 ; Ndiscrete= 2 ;
readin :            : Nquad=  -1 ; iMpol=  -4 ; iNtor=  -4 ;
readin :            : Lsparse= 0 ; Lsvdiota= 1 ; imethod= 3 ; iorder= 2 ; iprecon= 1 ; iotatol= -1.00000E+00 ;
readin :            : Lextrap= 0 ; Mregular= -1 ;
readin :            : 
readin :       0.00 : LBeltrami= 4 ; Linitgues= 1 ;
readin :            : 
readin :       0.00 : Lfindzero= 2 ;
readin :            : escale=  0.00000E+00 ; opsilon=  1.00000E+00 ; pcondense=  4.000 ; epsilon=  1.00000E-01 ; wpoloidal= 1.0000 ; upsilon=  1.00000E-01 ;
readin :            : forcetol=  1.00000E-12 ; c05xmax=  1.00000E-06 ; c05xtol=  1.00000E-12 ; c05factor=  1.00000E-04 ; LreadGF= F ; 
readin :            : mfreeits=   5 ; gBntol=  1.00000E-10 ; gBnbld=  0.00000E+00 ;
readin :            : vcasingeps=  1.00000E-12 ; vcasingtol=  1.00000E-10 ; vcasingits=  8 ; vcasingper=  1 ;
readin :            : 
readin :       0.00 : odetol=  1.00E-07 ; nPpts=   500 ;
readin :            : LHevalues= F ; LHevectors= F ; LHmatrix= F ; Lperturbed= 0 ; dpp= -1 ; dqq= -1 ; Lcheck=  1 ; Ltiming= F ;
readin :            : 
readin :            : myid=  0 ; Rscale= 3.999000000000000E+00 ;
preset :            : myid=  0 ; Mrad= 10 : Lrad=  8,  8,  8,  8,  8,  8,  8, 10,
preset :       0.00 : LBsequad= F , LBnewton= F , LBlinear= T ;
preset :            : 
preset :       0.00 : Nquad=  -1 ; mn=   17 ; NGdof=   231 ; NAdof=   341,   358,   358,   358,   358,   358,   358,   426,
preset :            : 
preset :       0.00 : Nt=   128 ; Nz=     1 ; Ntz=      128 ;
newton :      14.98 :         0  0 ; |f|= 3.84584E-03 ; time=      0.25s ; log|BB|e= -5.11 -4.96 -4.83 -4.68 -4.52 -4.38 -4.18
newton :            :              ;                                     ; log|II|o= -4.27 -3.45 -3.03 -2.81 -2.50 -2.27 -2.88
fcn2   :      16.70 :         1  1 ; |f|= 3.84584E-03 ; time=      1.97s ; log|BB|e= -5.11 -4.96 -4.83 -4.68 -4.52 -4.38 -4.18
fcn2   :            :              ;                                     ; log|II|o= -4.27 -3.45 -3.03 -2.81 -2.50 -2.27 -2.88
fcn2   :      16.84 :         2  1 ; |f|= 3.77091E-03 ; time=      0.15s ; log|BB|e= -5.12 -4.96 -4.82 -4.68 -4.53 -4.35 -4.17
fcn2   :            :              ;                                     ; log|II|o= -4.27 -3.45 -3.04 -2.82 -2.50 -2.28 -2.88
fcn2   :      17.01 :         3  1 ; |f|= 3.62332E-03 ; time=      0.17s ; log|BB|e= -5.14 -4.96 -4.81 -4.68 -4.55 -4.30 -4.14
fcn2   :            :              ;                                     ; log|II|o= -4.28 -3.46 -3.05 -2.84 -2.51 -2.30 -2.87
fcn2   :      17.17 :         4  1 ; |f|= 3.33740E-03 ; time=      0.16s ; log|BB|e= -5.18 -4.96 -4.80 -4.67 -4.60 -4.22 -4.09
fcn2   :            :              ;                                     ; log|II|o= -4.30 -3.49 -3.08 -2.87 -2.53 -2.34 -2.84
fcn2   :      17.33 :         5  1 ; |f|= 2.80233E-03 ; time=      0.16s ; log|BB|e= -5.26 -4.95 -4.76 -4.68 -4.67 -4.10 -4.01
fcn2   :            :              ;                                     ; log|II|o= -4.35 -3.54 -3.15 -2.95 -2.57 -2.43 -2.82
fcn2   :      17.50 :         6  1 ; |f|= 1.85834E-03 ; time=      0.16s ; log|BB|e= -5.50 -4.93 -4.68 -4.75 -4.72 -3.98 -3.91
fcn2   :            :              ;                                     ; log|II|o= -4.48 -3.69 -3.33 -3.10 -2.73 -2.71 -2.89
fcn2   :      17.65 :         7  1 ; |f|= 6.57691E-04 ; time=      0.16s ; log|BB|e= -6.16 -4.98 -4.71 -5.08 -4.45 -4.13 -4.00



spech :      73.69 : myid=  0 : time=    1.23m =   0.02h =  0.00d ;
ending :            : 
ending :      73.72 : myid=  0 ; completion ; time=     73.72s =     1.23m =   0.02h =  0.00d ; date= 2019/07/25 ; time= 14:01:35 ; ext = test_hdf5                                                   
ending :            : 
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG IEEE_DIVIDE_BY_ZERO IEEE_UNDERFLOW_FLAG IEEE_DENORMAL

Then I compiled a debug version of the executable and found bugs in sphdf5.f90. I fixed the bugs and ran dspec with the same input file. It got stuck at mirror_input_to_outfile. Even after I reverted to the old version (without my commit) and ran xspec, it still got stuck in this subroutine without printing any information.

Here are the modules I used:

Currently Loaded Modulefiles:
  1) gv/3.7.4               4) emacs/24.2             7) kusari/2.2-88         10) idl/8.4               13) openmpi/4.0.1         16) lapack/3.5.0rhel6
  2) git/2.21.0             5) flexlm/common          8) matlab/r2018b         11) gcc/9.1.0             14) hdf5-parallel/1.10.5  17) /spec/gfortran
  3) vtk/5.0.3              6) flexlm/11.14.1.2       9) texlive/2012          12) szip/2.1.1            15) fftw/3.3.8

Can you pull the branch Hotfix_hdf5 to see if you could run the test case normally?


jonathanschilling commented on September 23, 2024

@zhucaoxiang

I can also contribute some of my python scripts. I will push my commits in the branch and let you know.

Thanks, that sounds quite interesting! Maybe this is also relevant for @KseniaAleynikova ?

I pulled your branch and compiled dspec and xspec on my local Debian machine using CC=gfortran_ubuntu and FC=mpif90.

First I ran xspec:
mpirun -n 8 ./xspec test_hdf5.sp 2>&1 | tee test_hdf5_xspec_gfortran_ubuntu.log
It ran without issues and produced test_hdf5_xspec_gfortran_ubuntu.log.

Then I ran dspec:
mpirun -n 8 ./dspec test_hdf5.sp 2>&1 | tee test_hdf5_dspec_gfortran_ubuntu.log
It produced an error from trying to deallocate the iwork array in tr00ab. The output can be found in test_hdf5_dspec_gfortran_ubuntu.log. Note that I changed the output unit to 6 (stdout) also in the macros file to log this error to a file without redirecting it explicitly. Then I noticed that the MPI error messages etc. are also in stderr and added 2>&1 in the command line above to redirect stderr to stdout.
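
The same stderr-into-stdout logging can be done from a Python driver script, sketched below. The child command is a hypothetical stand-in for `mpirun -n 8 ./dspec test_hdf5.sp`, and the log file name `run.log` is likewise only an example.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical child process that writes one line to each stream.
child = [sys.executable, "-c",
         "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"]

# stderr=subprocess.STDOUT plays the role of `2>&1` on the command line.
result = subprocess.run(child, stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT, text=True)
log_text = result.stdout

# Writing the captured text to a file and echoing it mimics `tee`.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "run.log")
    with open(log_path, "w") as log_file:
        log_file.write(log_text)
    print(log_text, end="")
```

Because both streams share one pipe, MPI abort messages written to stderr end up in the same log as the regular output, although their relative order can depend on buffering.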

The IEEE_... error messages sometimes appeared when I ran SPEC before changing the output format. In fact, since some of the field line following tasks failed (see test_hdf5_xspec_gfortran_ubuntu.log), I assume that they originate from there.

I will now try again with the Intel compiler on the Draco cluster at IPP to be sure the error is not compiler-specific.


jonathanschilling commented on September 23, 2024

I ran xspec on Draco and it produced some error I cannot relate to the output file at the moment.
A log is available as test_hdf5_xspec_Draco_Intel.log.

Running Intel-dspec on test_hdf5.sp leads to the same deallocation error as on Debian.


zhucaoxiang commented on September 23, 2024

@jonathanschilling Thanks for your help. I will try it again on different nodes.

Later, I also tested the Intel compiler, and it also got stuck at mirror_input.
I was using FC=mpifort for both intel and CC=gfortran, because I got a use mpi error with FC=mpif90.
I observed the iwork not allocated error only once. For all the other runs, SPEC stayed in mirror_input.
I only loaded hdf5-parallel, without hdf5-serial. I guess this is not a problem.
The IEEE_DIVIDE_BY_ZERO message might not affect normal usage, but it is a little bit annoying. I will try to get rid of it later.
I guess iwork not allocated is a bug that we need to fix.


zhucaoxiang commented on September 23, 2024

I switched to another compute node at PPPL, and it works with the Intel compiler. Somehow, I got the same error as in your Draco log file. This is strange.


zhucaoxiang commented on September 23, 2024

xspec aborts with the abs(det).lt.small error:

czhu@ellis001:/p/focus/users/czhu/SPEC/test/SOLOVEV/paper> mpiexec /p/focus/share/spec/xspec test_hdf5.sp
xspech :            : 
       :  compiled  : date    = Thu Jul 25 15:49:00 EDT 2019 ; 
       :            : srcdir  = /p/focus/share/spec ; 
       :            : macros  = macros ; 
       :            : fc      = mpifort ; 
       :            : flags   =  -r8 -mcmodel=large -O0 -m64 -unroll0 -fno-alias
  -ip -traceback ; 
xspech :            : 
xspech :       0.00 : begin execution ; ncpu=  8 ; calling global:readin ;
readin :            : 
readin :       0.00 : date=2019/07/26 , 09:02:33 ; machine precision= 1.11E-16 ; vsmall= 1.11E-14 ; small= 1.11E-12 ;
readin :            : 
readin :       0.00 : ext = test_hdf5                                                                                           
readin :            : 
readin :            : 
readin :       0.00 : Igeometry=  3 ; Istellsym=  1 ;
readin :            : Lfreebound=  1 ; phiedge=  1.000000000000000E+00 ; curtor= -1.190644396046887E-01 ; curpol=  4.881440045660601E+00 ;
readin :            : gamma=  0.000000000000000E+00 ;
readin :            : Nfp=  1 ; Nvol=  7 ; Mvol=  8 ; Mpol= 16 ; Ntor=  0 ;
readin :            : pscale=  1.55477E-07 ; Ladiabatic= 0 ; Lconstraint=  1 ; mupf: tol,its=  1.00E-12 , 128 ;
readin :            : Lrad =  8, 8, 8, 8, 8, 8, 8,10,
readin :            : 
readin :       0.00 : Linitialize=  1 ; Lzerovac= 0 ; Ndiscrete= 2 ;
readin :            : Nquad=  -1 ; iMpol=  -4 ; iNtor=  -4 ;
readin :            : Lsparse= 0 ; Lsvdiota= 1 ; imethod= 3 ; iorder= 2 ; iprecon= 1 ; iotatol= -1.00000E+00 ;
readin :            : Lextrap= 0 ; Mregular= -1 ;
readin :            : 
readin :       0.00 : LBeltrami= 4 ; Linitgues= 1 ;
readin :            : 
readin :       0.00 : Lfindzero= 2 ;
readin :            : escale=  0.00000E+00 ; opsilon=  1.00000E+00 ; pcondense=  4.000 ; epsilon=  1.00000E-01 ; wpoloidal= 1.0000 ; upsilon=  1.00000E-01 ;
readin :            : forcetol=  1.00000E-12 ; c05xmax=  1.00000E-06 ; c05xtol=  1.00000E-12 ; c05factor=  1.00000E-04 ; LreadGF= F ; 
readin :            : mfreeits=   5 ; gBntol=  1.00000E-10 ; gBnbld=  0.00000E+00 ;
readin :            : vcasingeps=  1.00000E-12 ; vcasingtol=  1.00000E-10 ; vcasingits=  8 ; vcasingper=  1 ;
readin :            : 
readin :       0.00 : odetol=  1.00E-07 ; nPpts=   500 ;
readin :            : LHevalues= F ; LHevectors= F ; LHmatrix= F ; Lperturbed= 0 ; dpp= -1 ; dqq= -1 ; Lcheck=  1 ; Ltiming= F ;
readin :            : 
readin :            : myid=  0 ; Rscale= 3.999000000000000E+00 ;
preset :            : myid=  0 ; Mrad= 10 : Lrad=  8,  8,  8,  8,  8,  8,  8, 10,
preset :       0.03 : LBsequad= F , LBnewton= F , LBlinear= T ;
preset :            : 
preset :       0.03 : Nquad=  -1 ; mn=   17 ; NGdof=   231 ; NAdof=   341,   358,   358,   358,   358,   358,   358,   426,
preset :            : 
preset :       0.03 : Nt=   128 ; Nz=     1 ; Ntz=      128 ;
after preset
after init_outfile
after mirror_input
ma02aa :       3.49 : myid=  7 ; lvol=  8 ; Linear : ihybrj =  5 hel=  1.0095E+05 mu=  0.0000E+00 dpflux= -2.8167E-02 time=      0.4 ; bad progress     ; F=  2.E-01 -9.E-14
newton :       3.50 :         0  0 ; |f|= 3.84584E-03 ; time=      1.30s ; log|BB|e= -5.11 -4.96 -4.83 -4.68 -4.52 -4.38 -4.18
newton :            :              ;                                     ; log|II|o= -4.27 -3.45 -3.03 -2.81 -2.50 -2.27 -2.88
ma02aa :       4.72 : myid=  7 ; lvol=  8 ; Linear : ihybrj =  5 hel=  1.0095E+05 mu=  0.0000E+00 dpflux= -2.8167E-02 time=      0.4 ; bad progress     ; F=  2.E-01 -9.E-14
ma02aa :       6.04 : myid=  7 ; lvol=  8 ; Linear : ihybrj =  5 hel=  1.0095E+05 mu=  0.0000E+00 dpflux= -2.8167E-02 time=      0.5 ; bad progress     ; F=  2.E-01 -9.E-14
fcn2   :      35.46 :         1  1 ; |f|= 3.84584E-03 ; time=     33.26s ; log|BB|e= -5.11 -4.96 -4.83 -4.68 -4.52 -4.38 -4.18
fcn2   :            :              ;                                     ; log|II|o= -4.27 -3.45 -3.03 -2.81 -2.50 -2.27 -2.88
ma02aa :      36.74 : myid=  7 ; lvol=  8 ; Linear : ihybrj =  5 hel= -4.6523E+06 mu=  0.0000E+00 dpflux=  4.8183E+06 time=      0.5 ; bad progress     ; F=  2.E-01 -1.E-10
fcn2   :      36.75 :         2  1 ; |f|= 8.17317E+11 ; time=      1.29s ; log|BB|e= -5.68 -4.92 -4.68 -4.63 -5.26 -4.10 12.24
fcn2   :            :              ;                                     ; log|II|o= -4.46 -3.71 -3.37 -3.22 -2.64 -2.78 -2.65
ma02aa :      38.02 : myid=  7 ; lvol=  8 ; Linear : ihybrj =  5 hel=  2.0781E+16 mu=  0.0000E+00 dpflux=  4.8183E+06 time=      0.4 ; bad progress     ; F=  2.E-01  2.E-04
ma02aa :      39.35 : myid=  7 ; lvol=  8 ; Linear : ihybrj =  5 hel=  3.8511E+16 mu=  0.0000E+00 dpflux=  4.8183E+06 time=      0.5 ; bad progress     ; F=  2.E-01  1.E-04
dforce :      fatal : myid=  7 ; abs(det).lt.small ; error computing derivatives of dtflux & dpflux wrt geometry at fixed Itor and Gpol ;
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 7 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

dspec aborts with the iwork not allocated error:

czhu@ellis001:/p/focus/users/czhu/SPEC/test/SOLOVEV/paper> mpiexec /p/focus/share/spec/dspec test_hdf5.sp
xspech :            : 
       :  compiled  : date    = Fri Jul 26 09:05:07 EDT 2019 ; 
       :            : srcdir  = /p/focus/share/spec ; 
       :            : macros  = macros ; 
       :            : fc      = mpifort ; 
       :            : flags   =  -r8 -check bounds -check format -check output_c
 onversion -check pointers -check uninit -debug full -D DEBUG ; 
xspech :            : 
xspech :       0.00 : begin execution ; ncpu=  8 ; calling global:readin ;
readin :            : 
readin :       0.00 : date=2019/07/26 , 09:06:35 ; machine precision= 1.11E-16 ; vsmall= 1.11E-14 ; small= 1.11E-12 ;
readin :            : 
readin :       0.00 : ext = test_hdf5                                                                                           
readin :            : 
readin :            : 
readin :       0.00 : Igeometry=  3 ; Istellsym=  1 ;
readin :            : Lfreebound=  1 ; phiedge=  1.000000000000000E+00 ; curtor= -1.190644396046887E-01 ; curpol=  4.881440045660601E+00 ;
readin :            : gamma=  0.000000000000000E+00 ;
readin :            : Nfp=  1 ; Nvol=  7 ; Mvol=  8 ; Mpol= 16 ; Ntor=  0 ;
readin :            : pscale=  1.55477E-07 ; Ladiabatic= 0 ; Lconstraint=  1 ; mupf: tol,its=  1.00E-12 , 128 ;
readin :            : Lrad =  8, 8, 8, 8, 8, 8, 8,10,
readin :            : 
readin :       0.00 : Linitialize=  1 ; Lzerovac= 0 ; Ndiscrete= 2 ;
readin :            : Nquad=  -1 ; iMpol=  -4 ; iNtor=  -4 ;
readin :            : Lsparse= 0 ; Lsvdiota= 1 ; imethod= 3 ; iorder= 2 ; iprecon= 1 ; iotatol= -1.00000E+00 ;
readin :            : Lextrap= 0 ; Mregular= -1 ;
readin :            : 
readin :       0.00 : LBeltrami= 4 ; Linitgues= 1 ;
readin :            : 
readin :       0.00 : Lfindzero= 2 ;
readin :            : escale=  0.00000E+00 ; opsilon=  1.00000E+00 ; pcondense=  4.000 ; epsilon=  1.00000E-01 ; wpoloidal= 1.0000 ; upsilon=  1.00000E-01 ;
readin :            : forcetol=  1.00000E-12 ; c05xmax=  1.00000E-06 ; c05xtol=  1.00000E-12 ; c05factor=  1.00000E-04 ; LreadGF= F ; 
readin :            : mfreeits=   5 ; gBntol=  1.00000E-10 ; gBnbld=  0.00000E+00 ;
readin :            : vcasingeps=  1.00000E-12 ; vcasingtol=  1.00000E-10 ; vcasingits=  8 ; vcasingper=  1 ;
readin :            : 
readin :       0.00 : odetol=  1.00E-07 ; nPpts=   500 ;
readin :            : LHevalues= F ; LHevectors= F ; LHmatrix= F ; Lperturbed= 0 ; dpp= -1 ; dqq= -1 ; Lcheck=  1 ; Ltiming= F ;
readin :            : 
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
readin :            : myid=  0 ; Rscale= 3.999000000000000E+00 ;
preset :            : myid=  0 ; Mrad= 10 : Lrad=  8,  8,  8,  8,  8,  8,  8, 10,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :       0.01 : LBsequad= F , LBnewton= F , LBlinear= T ;
preset :            : 
preset :       0.02 : Nquad=  -1 ; mn=   17 ; NGdof=   231 ; NAdof=   341,   358,   358,   358,   358,   358,   358,   426,
preset :            : 
preset :       0.02 : Nt=   128 ; Nz=     1 ; Ntz=      128 ;
after preset
after init_outfile
after mirror_input
macros :       6.88 : myid=  5 ; iwork not allocated ;
macros :       6.89 : myid=  4 ; iwork not allocated ;
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 5 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[ellis001.pppl.gov:01081] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[ellis001.pppl.gov:01081] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

After fixing the iwork error in tr00ab.f90, the run gets stuck at mirror_input again (for both xspec and dspec):

czhu@ellis001:/p/focus/users/czhu/SPEC/test/SOLOVEV/paper> mpiexec /p/focus/share/spec/dspec test_hdf5.sp
xspech :            : 
       :  compiled  : date    = Fri Jul 26 09:05:07 EDT 2019 ; 
       :            : srcdir  = /p/focus/share/spec ; 
       :            : macros  = macros ; 
       :            : fc      = mpifort ; 
       :            : flags   =  -r8 -check bounds -check format -check output_c
 onversion -check pointers -check uninit -debug full -D DEBUG ; 
xspech :            : 
xspech :       0.00 : begin execution ; ncpu=  8 ; calling global:readin ;
readin :            : 
readin :       0.00 : date=2019/07/26 , 09:12:45 ; machine precision= 1.11E-16 ; vsmall= 1.11E-14 ; small= 1.11E-12 ;
readin :            : 
readin :       0.00 : ext = test_hdf5                                                                                           
readin :            : 
readin :            : 
readin :       0.00 : Igeometry=  3 ; Istellsym=  1 ;
readin :            : Lfreebound=  1 ; phiedge=  1.000000000000000E+00 ; curtor= -1.190644396046887E-01 ; curpol=  4.881440045660601E+00 ;
readin :            : gamma=  0.000000000000000E+00 ;
readin :            : Nfp=  1 ; Nvol=  7 ; Mvol=  8 ; Mpol= 16 ; Ntor=  0 ;
readin :            : pscale=  1.55477E-07 ; Ladiabatic= 0 ; Lconstraint=  1 ; mupf: tol,its=  1.00E-12 , 128 ;
readin :            : Lrad =  8, 8, 8, 8, 8, 8, 8,10,
readin :            : 
readin :       0.00 : Linitialize=  1 ; Lzerovac= 0 ; Ndiscrete= 2 ;
readin :            : Nquad=  -1 ; iMpol=  -4 ; iNtor=  -4 ;
readin :            : Lsparse= 0 ; Lsvdiota= 1 ; imethod= 3 ; iorder= 2 ; iprecon= 1 ; iotatol= -1.00000E+00 ;
readin :            : Lextrap= 0 ; Mregular= -1 ;
readin :            : 
readin :       0.00 : LBeltrami= 4 ; Linitgues= 1 ;
readin :            : 
readin :       0.00 : Lfindzero= 2 ;
readin :            : escale=  0.00000E+00 ; opsilon=  1.00000E+00 ; pcondense=  4.000 ; epsilon=  1.00000E-01 ; wpoloidal= 1.0000 ; upsilon=  1.00000E-01 ;
readin :            : forcetol=  1.00000E-12 ; c05xmax=  1.00000E-06 ; c05xtol=  1.00000E-12 ; c05factor=  1.00000E-04 ; LreadGF= F ; 
readin :            : mfreeits=   5 ; gBntol=  1.00000E-10 ; gBnbld=  0.00000E+00 ;
readin :            : vcasingeps=  1.00000E-12 ; vcasingtol=  1.00000E-10 ; vcasingits=  8 ; vcasingper=  1 ;
readin :            : 
readin :       0.00 : odetol=  1.00E-07 ; nPpts=   500 ;
readin :            : LHevalues= F ; LHevectors= F ; LHmatrix= F ; Lperturbed= 0 ; dpp= -1 ; dqq= -1 ; Lcheck=  1 ; Ltiming= F ;
readin :            : 
readin :            : myid=  0 ; Rscale= 3.999000000000000E+00 ;
preset :            : myid=  0 ; Mrad= 10 : Lrad=  8,  8,  8,  8,  8,  8,  8, 10,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :            : sweight= 1.42857E-02, 2.85714E-02, 4.28571E-02, 5.71429E-02, 7.14286E-02, 8.57143E-02, 1.00000E-01, 1.14286E-01,
preset :       0.01 : LBsequad= F , LBnewton= F , LBlinear= T ;
preset :            : 
preset :       0.02 : Nquad=  -1 ; mn=   17 ; NGdof=   231 ; NAdof=   341,   358,   358,   358,   358,   358,   358,   426,
preset :            : 
preset :       0.02 : Nt=   128 ; Nz=     1 ; Ntz=      128 ;
after preset
after init_outfile

I even compiled both xspec and dspec on the compute node, but both get stuck at mirror_input. That's strange: it executed normally at the very beginning, but after I recompiled it, it never worked again.

zhucaoxiang commented on September 23, 2024

It works with one CPU (mpirun -np 1 xspec ...). With mpirun -np 2 xspec, it gets stuck again. There is definitely something wrong in the subroutine mirror_input_to_outfile; I will check the subroutine directly.

czhu@ellis001:/p/focus/users/czhu/SPEC/test/SOLOVEV/paper> mpirun -np 1 /p/focus/share/spec/xspec test_hdf5.sp
xspech :            : 
       :  compiled  : date    = Fri Jul 26 09:20:51 EDT 2019 ; 
       :            : srcdir  = /p/focus/share/spec ; 
       :            : macros  = macros ; 
       :            : fc      = mpifort ; 
       :            : flags   =  -r8 -mcmodel=large -O0 -m64 -unroll0 -fno-alias
  -ip -traceback ; 
xspech :            : 
xspech :       0.00 : begin execution ; ncpu=  1 ; calling global:readin ;
readin :            : 
readin :       0.00 : date=2019/07/26 , 09:26:48 ; machine precision= 1.11E-16 ; vsmall= 1.11E-14 ; small= 1.11E-12 ;
readin :            : 
readin :       0.00 : ext = test_hdf5                                                                                           
readin :            : 
readin :            : 
readin :       0.00 : Igeometry=  3 ; Istellsym=  1 ;
readin :            : Lfreebound=  1 ; phiedge=  1.000000000000000E+00 ; curtor= -1.190644396046887E-01 ; curpol=  4.881440045660601E+00 ;
readin :            : gamma=  0.000000000000000E+00 ;
readin :            : Nfp=  1 ; Nvol=  7 ; Mvol=  8 ; Mpol= 16 ; Ntor=  0 ;
readin :            : pscale=  1.55477E-07 ; Ladiabatic= 0 ; Lconstraint=  1 ; mupf: tol,its=  1.00E-12 , 128 ;
readin :            : Lrad =  8, 8, 8, 8, 8, 8, 8,10,
readin :            : 
readin :       0.00 : Linitialize=  1 ; Lzerovac= 0 ; Ndiscrete= 2 ;
readin :            : Nquad=  -1 ; iMpol=  -4 ; iNtor=  -4 ;
readin :            : Lsparse= 0 ; Lsvdiota= 1 ; imethod= 3 ; iorder= 2 ; iprecon= 1 ; iotatol= -1.00000E+00 ;
readin :            : Lextrap= 0 ; Mregular= -1 ;
readin :            : 
readin :       0.00 : LBeltrami= 4 ; Linitgues= 1 ;
readin :            : 
readin :       0.00 : Lfindzero= 2 ;
readin :            : escale=  0.00000E+00 ; opsilon=  1.00000E+00 ; pcondense=  4.000 ; epsilon=  1.00000E-01 ; wpoloidal= 1.0000 ; upsilon=  1.00000E-01 ;
readin :            : forcetol=  1.00000E-12 ; c05xmax=  1.00000E-06 ; c05xtol=  1.00000E-12 ; c05factor=  1.00000E-04 ; LreadGF= F ; 
readin :            : mfreeits=   5 ; gBntol=  1.00000E-10 ; gBnbld=  0.00000E+00 ;
readin :            : vcasingeps=  1.00000E-12 ; vcasingtol=  1.00000E-10 ; vcasingits=  8 ; vcasingper=  1 ;
readin :            : 
readin :       0.00 : odetol=  1.00E-07 ; nPpts=   500 ;
readin :            : LHevalues= F ; LHevectors= F ; LHmatrix= F ; Lperturbed= 0 ; dpp= -1 ; dqq= -1 ; Lcheck=  1 ; Ltiming= F ;
readin :            : 
readin :            : myid=  0 ; Rscale= 3.999000000000000E+00 ;
preset :            : myid=  0 ; Mrad= 10 : Lrad=  8,  8,  8,  8,  8,  8,  8, 10,
preset :       0.01 : LBsequad= F , LBnewton= F , LBlinear= T ;
preset :            : 
preset :       0.01 : Nquad=  -1 ; mn=   17 ; NGdof=   231 ; NAdof=   341,   358,   358,   358,   358,   358,   358,   426,
preset :            : 
preset :       0.01 : Nt=   128 ; Nz=     1 ; Ntz=      128 ;
after preset
after init_outfile
after mirror_input
ma02aa :       6.01 : myid=  0 ; lvol=  8 ; Linear : ihybrj =  5 hel=  1.0095E+05 mu=  0.0000E+00 dpflux= -2.8167E-02 time=      0.5 ; bad progress     ; F=  2.E-01 -1.E-13
newton :       6.01 :         0  0 ; |f|= 3.84584E-03 ; time=      5.61s ; log|BB|e= -5.11 -4.96 -4.83 -4.68 -4.52 -4.38 -4.18
newton :            :              ;                                     ; log|II|o= -4.27 -3.45 -3.03 -2.81 -2.50 -2.27 -2.88
ma02aa :      10.70 : myid=  0 ; lvol=  8 ; Linear : ihybrj =  5 hel=  1.0095E+05 mu=  0.0000E+00 dpflux= -2.8167E-02 time=      0.5 ; bad progress     ; F=  2.E-01 -1.E-13
^Cczhu@ellis001:/p/focus/users/czhu/SPEC/test/SOLOVEV/paper> 
czhu@ellis001:/p/focus/users/czhu/SPEC/test/SOLOVEV/paper> 
czhu@ellis001:/p/focus/users/czhu/SPEC/test/SOLOVEV/paper> mpirun -np 2 /p/focus/share/spec/xspec test_hdf5.sp
xspech :            : 
       :  compiled  : date    = Fri Jul 26 09:20:51 EDT 2019 ; 
       :            : srcdir  = /p/focus/share/spec ; 
       :            : macros  = macros ; 
       :            : fc      = mpifort ; 
       :            : flags   =  -r8 -mcmodel=large -O0 -m64 -unroll0 -fno-alias
  -ip -traceback ; 
xspech :            : 
xspech :       0.00 : begin execution ; ncpu=  2 ; calling global:readin ;
readin :            : 
readin :       0.00 : date=2019/07/26 , 09:28:19 ; machine precision= 1.11E-16 ; vsmall= 1.11E-14 ; small= 1.11E-12 ;
readin :            : 
readin :       0.00 : ext = test_hdf5                                                                                           
readin :            : 
readin :            : 
readin :       0.00 : Igeometry=  3 ; Istellsym=  1 ;
readin :            : Lfreebound=  1 ; phiedge=  1.000000000000000E+00 ; curtor= -1.190644396046887E-01 ; curpol=  4.881440045660601E+00 ;
readin :            : gamma=  0.000000000000000E+00 ;
readin :            : Nfp=  1 ; Nvol=  7 ; Mvol=  8 ; Mpol= 16 ; Ntor=  0 ;
readin :            : pscale=  1.55477E-07 ; Ladiabatic= 0 ; Lconstraint=  1 ; mupf: tol,its=  1.00E-12 , 128 ;
readin :            : Lrad =  8, 8, 8, 8, 8, 8, 8,10,
readin :            : 
readin :       0.00 : Linitialize=  1 ; Lzerovac= 0 ; Ndiscrete= 2 ;
readin :            : Nquad=  -1 ; iMpol=  -4 ; iNtor=  -4 ;
readin :            : Lsparse= 0 ; Lsvdiota= 1 ; imethod= 3 ; iorder= 2 ; iprecon= 1 ; iotatol= -1.00000E+00 ;
readin :            : Lextrap= 0 ; Mregular= -1 ;
readin :            : 
readin :       0.00 : LBeltrami= 4 ; Linitgues= 1 ;
readin :            : 
readin :       0.00 : Lfindzero= 2 ;
readin :            : escale=  0.00000E+00 ; opsilon=  1.00000E+00 ; pcondense=  4.000 ; epsilon=  1.00000E-01 ; wpoloidal= 1.0000 ; upsilon=  1.00000E-01 ;
readin :            : forcetol=  1.00000E-12 ; c05xmax=  1.00000E-06 ; c05xtol=  1.00000E-12 ; c05factor=  1.00000E-04 ; LreadGF= F ; 
readin :            : mfreeits=   5 ; gBntol=  1.00000E-10 ; gBnbld=  0.00000E+00 ;
readin :            : vcasingeps=  1.00000E-12 ; vcasingtol=  1.00000E-10 ; vcasingits=  8 ; vcasingper=  1 ;
readin :            : 
readin :       0.00 : odetol=  1.00E-07 ; nPpts=   500 ;
readin :            : LHevalues= F ; LHevectors= F ; LHmatrix= F ; Lperturbed= 0 ; dpp= -1 ; dqq= -1 ; Lcheck=  1 ; Ltiming= F ;
readin :            : 
readin :            : myid=  0 ; Rscale= 3.999000000000000E+00 ;
preset :            : myid=  0 ; Mrad= 10 : Lrad=  8,  8,  8,  8,  8,  8,  8, 10,
preset :       0.00 : LBsequad= F , LBnewton= F , LBlinear= T ;
preset :            : 
preset :       0.01 : Nquad=  -1 ; mn=   17 ; NGdof=   231 ; NAdof=   341,   358,   358,   358,   358,   358,   358,   426,
preset :            : 
preset :       0.01 : Nt=   128 ; Nz=     1 ; Ntz=      128 ;
after preset
after init_outfile

zhucaoxiang commented on September 23, 2024

The cause was identified as a PPPL cluster issue. Somehow, with the latest Intel or gfortran compilers, parallel HDF5 does not run normally. I used intel/2018 on the PPPL cluster and the error is finally gone.

I have committed some other fixes into the branch issue68.

jonathanschilling commented on September 23, 2024

There are some output files which are still written in the old output format:

dunit .ext.sp.DF (derivative matrix)
hunit .ext.GF.ev (eigenvalues of Hessian)
munit .ext.GF.ma (matrix elements of Hessian)
lunit ext.Vn (normal component of vacuum field)

I have not seen MATLAB routines for reading/writing these, so I have left them out of the new output format so far. As my trip to PPPL comes closer, I would like to postpone porting these inputs/outputs to the new file format until the actual need for them arises.

I will therefore issue the pull request now.

zhisong commented on September 23, 2024

Thank you @jonathanschilling
I will start to port the Zernike branch to the new version.

zhucaoxiang commented on September 23, 2024

@jonathanschilling I observed two things by testing with SPEC/InputFiles/TestCases/G3V02L1Fi.001.sp

The more CPUs are used, the slower the outputs are written
This might not be surprising with parallel HDF5, but it might also be caused by something we haven't identified. Here are the times printed for the first newton iteration.
- 1 CPU: 2.84 sec
- 2 CPUs: 3.00 sec
- 16 CPUs: 25.15 sec
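
To put these numbers in proportion, here is a small arithmetic sketch that computes how much longer each run took relative to the 1-CPU run. Only the three timings come from the list above; the helper name `slowdown` is made up for illustration.

```python
# Wall-clock times (seconds) quoted above for the first newton iteration.
timings = {1: 2.84, 2: 3.00, 16: 25.15}

def slowdown(timings, baseline_cpus=1):
    """Return time(n) / time(baseline) for each CPU count."""
    base = timings[baseline_cpus]
    return {n: t / base for n, t in timings.items()}

for n, factor in slowdown(timings).items():
    print(f"{n:2d} CPUs: {factor:.2f}x the 1-CPU time")
```

With these inputs the 16-CPU run takes roughly 8.9 times as long as the 1-CPU run, while the 2-CPU run is only about 6% slower.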

preset :       0.02 : Nquad=  -1 ; mn=   41 ; NGdof=    81 ; NAdof=   497,   534,
preset :            : 
preset :       0.02 : Nt=    32 ; Nz=    32 ; Ntz=     1024 ;
           8 I'm here.
           6 I'm here.
           4 I'm here.
          12 I'm here.
           2 I'm here.
           0 I'm here.
           9 I'm here.
          10 I'm here.
           5 I'm here.
           3 I'm here.
          13 I'm here.
           1 I'm here.
           7 I'm here.
          11 I'm here.
          14 I'm here.
          15 I'm here.
newton :      25.15 :         0  0 ; |f|= 1.48437E-02 ; time=      1.19s ; log|BB|e= -4.66
newton :            :              ;                                     ; log|II|o= -2.42

For some reason, without the above print operations, xspec gets stuck or becomes extremely slow
I put a trivial print statement at line 136 in xspech.f90, after the call to mirror_input_to_outfile. With it, the run produces the output above. But once I commented it out, the run got stuck (I waited at least 5 minutes and it didn't print the newton output).

  WCALL( xspech, mirror_input_to_outfile ) ! mirror input file contents to output file

  !!print *, myid, "I'm here."

I am using the Intel compiler on the PPPL cluster with 16 CPUs. Does anyone have similar issues? @zhisong @jloizu @abaillod

zhisong commented on September 23, 2024

Similar things happen for me. However, SPEC is parallelized over volumes. It is unclear to me what SPEC will do if we assign 3 CPUs to 2 volumes. Will the additional CPU do anything, or will it get stuck because we haven't told it what to do?
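
To make the question concrete, here is a minimal sketch of what "more CPUs than volumes" looks like under a simple round-robin assignment. The round-robin rule and the names `assign_volumes` and `idle_ranks` are assumptions for illustration only; the thread does not state how SPEC actually maps volumes to MPI ranks.

```python
def assign_volumes(n_volumes, n_cpus):
    """Hypothetical round-robin: volume v (1-based) goes to rank (v - 1) % n_cpus."""
    work = {rank: [] for rank in range(n_cpus)}
    for v in range(1, n_volumes + 1):
        work[(v - 1) % n_cpus].append(v)
    return work

def idle_ranks(n_volumes, n_cpus):
    """Ranks that receive no volume at all under the assumed assignment."""
    return [rank for rank, vols in assign_volumes(n_volumes, n_cpus).items() if not vols]

print(assign_volumes(2, 3))  # {0: [1], 1: [2], 2: []}
print(idle_ranks(2, 3))      # [2]
```

In MPI generally, a rank without work must still take part in every collective call (broadcasts, barriers, collective I/O) that the other ranks make; otherwise the ranks that did enter the call wait indefinitely. Whether that is what happens here is not established in this thread.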

jonathanschilling commented on September 23, 2024

Hmm, this is really weird.
I checked on IPP's draco cluster with the intel compiler and the following modules:

jons@draco03:~/src/SPEC> module list
Currently Loaded Modulefiles:
  1) intel/18.0.3      2) impi/2018.3       3) mkl/2018.3        4) git/2.16          5) fftw-mpi/3.3.8    6) hdf5-mpi/1.10.5

I cannot observe the scaling you see; neither in interactive run mode nor from within the SLURM batch system.
One thing I noticed is that in the default section of the Makefile, you use mpif90 as the default for FC. On draco, this wrapper is mapped to gfortran including the MPI-relevant flags. I had to override it with mpiifort in the intel_ipp section of the Makefile in order to get it to work.

jonathanschilling commented on September 23, 2024

@zhucaoxiang I dug a little deeper and I was able to reproduce this behavior on my laptop (gfortran_arch section in the Makefile).
However, it seems that the delay is coming from the call to dforce in newton.f90:148.
As long as I do not assign more CPUs than there are volumes in the run, all is ok though.
So we might need to keep searching in the direction @zhisong pointed out above.

zhucaoxiang commented on September 23, 2024

@jonathanschilling I agree that we need to check the parallelization part, but I think I also have some issues with parallel HDF5. I cannot get past mirror_input_to_outfile with multiple CPUs. When you have time, please stop by my office at B348.

jonathanschilling commented on September 23, 2024

@zhucaoxiang I safeguarded the write operations in the HWRITE... macros so that they should only be called by MPI ID 0. Could you please pull issue68 and check if it works now?
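
The idea of such a guard can be sketched as follows. This is a plain-Python illustration, not the HWRITE macros themselves (those are Fortran preprocessor macros in SPEC and are not reproduced here); `write_if_root` is a hypothetical helper, and the ranks are simulated with a loop instead of real MPI processes.

```python
def write_if_root(rank, write, *args, root=0):
    """Call write(*args) only on the root rank; every other rank skips it."""
    if rank != root:
        return False
    write(*args)
    return True

# Simulate 4 ranks that all reach the same write statement.
outfile = {}
calls = [write_if_root(rank, outfile.__setitem__, "Igeometry", 3) for rank in range(4)]

print(calls)    # [True, False, False, False]
print(outfile)  # {'Igeometry': 3}
```

One caveat that applies to parallel HDF5 in general: when a file is opened by all ranks through the MPI-IO driver, calls that change the file structure (creating groups or datasets) are collective and must be made by every rank. A rank-0-only guard therefore fits a design in which rank 0 alone owns the file. Which of the two designs issue68 uses is not spelled out in this comment.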

jonathanschilling commented on September 23, 2024

Sorry for being absent for so long.
I will work on bringing the documentation of the new HDF5 output up to date.
Regarding the remaining differences (#78), I would like to suggest expanding the TestCases and Verification folders into self-contained scripts which can be run every time before a new version of SPEC is released. We might have to exclude the computationally very expensive runs from this, but even the small and fast-to-execute cases should already give confidence that nothing is going spectacularly wrong in the new release.
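
A minimal sketch of the comparison step such a script could use is shown below. It assumes the quantities of a reference run and a new run have already been loaded into plain dictionaries of number lists (reading them from the HDF5 output is left out); the function name `compare_outputs` and the tolerances are illustrative choices, not existing SPEC tooling.

```python
import math

def compare_outputs(reference, candidate, rel_tol=1e-10, abs_tol=1e-14):
    """Return a list of human-readable differences between two result sets."""
    problems = []
    for name in sorted(set(reference) | set(candidate)):
        if name not in reference or name not in candidate:
            problems.append(f"{name}: present in only one output")
            continue
        ref, new = reference[name], candidate[name]
        if len(ref) != len(new):
            problems.append(f"{name}: length {len(ref)} vs {len(new)}")
            continue
        for i, (a, b) in enumerate(zip(ref, new)):
            if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                problems.append(f"{name}[{i}]: {a!r} vs {b!r}")
    return problems

# Example values borrowed from the run logs earlier in this thread.
reference = {"phiedge": [1.0], "Lrad": [8, 8, 8, 8, 8, 8, 8, 10]}
candidate = {"phiedge": [1.0 + 1e-13], "Lrad": [8, 8, 8, 8, 8, 8, 8, 10]}
print(compare_outputs(reference, candidate))  # [] (difference is within tolerance)
```

A release check could then run each fast test case, call a comparison like this against stored reference results, and fail if the returned list is non-empty.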

jonathanschilling commented on September 23, 2024

The HDF5 output file format has been in use now for quite some time and I believe that we have found and fixed most issues with it.

I am therefore closing this issue.
