
Comments (22)

aharwood2 avatar aharwood2 commented on July 23, 2024

Your HDF5 config is OK. The problem is that you are using a parallel environment (mpirun etc.) to run a serial build of LUMA. You need to compile LUMA with the L_BUILD_FOR_MPI definition uncommented in your definitions file (inc/definitions.h).
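
A quick way to see which kind of build you are about to get is to test whether the macro is active in the header. This is only a minimal sketch: it assumes the header is at inc/definitions.h and that the macro is spelled L_BUILD_FOR_MPI, as both are named later in this thread. is_defined is a hypothetical helper, not part of LUMA.

```shell
# Succeeds if a macro is actively defined (not commented out) in a header.
is_defined() {
    # $1 = macro name, $2 = header path
    grep -Eq "^[[:space:]]*#[[:space:]]*define[[:space:]]+$1([[:space:]]|\$)" "$2"
}

# Assumed path and macro name, taken from this thread.
if is_defined L_BUILD_FOR_MPI inc/definitions.h 2>/dev/null; then
    echo "parallel build requested"
else
    echo "serial build (L_BUILD_FOR_MPI not active or header not found)"
fi
```

A line such as //#define L_BUILD_FOR_MPI does not count as active, because the pattern only matches lines that start with #define after optional whitespace.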

from luma.

H0sseini avatar H0sseini commented on July 23, 2024

I didn't change anything in the definitions.h file. I attach the definition and geometry config files.
definitions.zip

geometry.zip


aharwood2 avatar aharwood2 commented on July 23, 2024

That is strange. The parallel flag is indeed defined, but the executable produced is clearly a serial build: each process tries to run its own LUMA instance and the text "(Serial Build)" is printed. Can you send us the log files from your output directory, then run make clean && make again and send us any output from the build process?


H0sseini avatar H0sseini commented on July 23, 2024

Well, I've run make clean && make and replaced definitions.h and geometry.config with the ones from the test case section. After that, I ran into another problem (even for a serial process):
Screenshot from 2020-01-07 14-21-21


H0sseini avatar H0sseini commented on July 23, 2024

Can anyone help?


aharwood2 avatar aharwood2 commented on July 23, 2024

Two things:

  1. Can you send us the log file from the output directory?
  2. Can you zip up your whole code directory and send it to us, so that we can try to reproduce the problem?


H0sseini avatar H0sseini commented on July 23, 2024

Well, after make clean && make I can't run LUMA even in serial mode, so no output directory is created. I couldn't upload files larger than 10 MB, so I split the zip file using zip -r -s.... It created LUMA.z01, LUMA.z02, ..., LUMA.z05 along with a LUMA.zip. I also couldn't upload a file unless it was a zip file, so I renamed LUMA.z01 to LUMA.z01.zip and so on. To extract, you need to remove ".zip" from the end of each of those file names. Here is my whole code directory:
LUMA.zip
LUMA.z05.zip
LUMA.z04.zip
LUMA.z03.zip
LUMA.z02.zip
LUMA.z01.zip


aharwood2 avatar aharwood2 commented on July 23, 2024

Unfortunately, some of these files are corrupt when I download them. You only need to send me the inc/*, src/* and input/geometry.config files, which are not very big at all. Would you mind zipping just these up and sending them through?


H0sseini avatar H0sseini commented on July 23, 2024

Here are the files:
Luma.zip
geometry.zip


H0sseini avatar H0sseini commented on July 23, 2024

I resolved the issue of LUMA not running (even in serial mode) by downloading the whole package again and running make. Some of the previous files might have been corrupted, which would explain why make clean && make didn't solve the issue. The parallel processing issue, however, remains unsolved. Here are my src/*, inc/* and input/geometry.config files:
LUMA.zip
geometry.zip
I have also run the command mpirun -np 2 ./LUMA and here is the .log file:
log_file.zip

Can anyone help?


aharwood2 avatar aharwood2 commented on July 23, 2024

Thank you for attaching your files. I unzipped them, built the code with make exactly as you sent it, and ran the resulting executable in serial without issue. I then ran make clean, opened inc/definitions.h, uncommented the L_BUILD_FOR_MPI line and rebuilt with make. As you specified L_XCORES 4 and L_YCORES 2 for a 2D problem, I ran the resulting executable on 8 cores (4 x 2) with the command mpirun -np 8 ./LUMA. I get the following error when the software calls HDF5, which I believe is what you are also seeing.

$ mpirun -np 8 ./LUMA
Running LUMA -- Version 1.7.9
(Parallel Build: 8 Processes)
>> *** An error occurred in MPI_Comm_create_keyval
>> *** reported by process [337444865,3]
>> *** on communicator MPI_COMM_WORLD
>> *** MPI_ERR_ARG: invalid argument of some other kind
>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> ***    and potentially your MPI job)
>> 7 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
>> Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>> 7 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal

It is possible that this is related to the configuration of HDF5. We are using 1.10.0-patch1. Can you confirm your HDF5 version? We will investigate further and let you know on this ticket when there is more information available. Thank you for your patience and thank you for reporting this.

In the meantime, you might want to continue to use LUMA in serial.
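
For reference, the process count used above (4 x 2 = 8) can be derived from the header rather than typed by hand. The following is a rough sketch under the assumption that the problem is 2D and that L_XCORES and L_YCORES appear as plain "#define NAME value" lines in inc/definitions.h. macro_value is a hypothetical helper, not something LUMA provides.

```shell
# Print the value of the first active "#define NAME value" line in a header.
macro_value() {
    # $1 = macro name, $2 = header path
    awk -v name="$1" '$1 == "#define" && $2 == name { print $3; exit }' "$2"
}

defs=inc/definitions.h   # assumed path, as used in this thread
if [ -f "$defs" ]; then
    nx=$(macro_value L_XCORES "$defs")
    ny=$(macro_value L_YCORES "$defs")
    # Only print a command if both values were found (2D case assumed).
    if [ -n "$nx" ] && [ -n "$ny" ]; then
        echo "mpirun -np $((nx * ny)) ./LUMA"
    fi
fi
```

The sketch only prints the command, so the result can be checked before anything is launched.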


aharwood2 avatar aharwood2 commented on July 23, 2024

One possibility is that open-mpi and mpich don't play well together. Can you confirm which ones you have installed?


aharwood2 avatar aharwood2 commented on July 23, 2024

Finally, the output of h5cc -showconfig might also be useful to us.
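
If it helps, both pieces of information can be collected roughly as sketched below. The assumption here is that mpirun --version mentions "Open MPI" for an open-mpi install and "HYDRA" or "MPICH" for an mpich one; classify_mpi is a hypothetical helper and the match strings may need adjusting on your system.

```shell
# Guess the MPI flavour from the text printed by "mpirun --version".
classify_mpi() {
    case "$1" in
        *"Open MPI"*)    echo openmpi ;;
        *HYDRA*|*MPICH*) echo mpich ;;
        *)               echo unknown ;;
    esac
}

# Report the flavour of whichever mpirun is first on the PATH.
if command -v mpirun >/dev/null 2>&1; then
    ver=$(mpirun --version 2>&1 || true)
    echo "mpirun flavour: $(classify_mpi "$ver")"
fi

# Show the lines of the HDF5 configuration summary that mention parallel support.
if command -v h5cc >/dev/null 2>&1; then
    h5cc -showconfig | grep -i 'parallel' || true
fi
```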


aharwood2 avatar aharwood2 commented on July 23, 2024

OK, looks like this was caused by HDF5 using the wrong mpio implementation. In your makefile, rather than using include and library paths which correspond to the mpich implementation, use open-mpi instead.

Can you edit the makefile (for example with vim makefile) and change the following lines (or similar):

INC=-I/usr/include/hdf5/mpich
LIB=-L/usr/lib/x86_64-linux-gnu/hdf5/mpich -lhdf5 -llapack

to

INC=-I/usr/include/hdf5/openmpi
LIB=-L/usr/lib/x86_64-linux-gnu/hdf5/openmpi -lhdf5 -llapack

then clean, rebuild and run: make clean && make && mpirun -np 8 ./LUMA

This fixes the error for us and LUMA then runs without issue. If you can confirm that it works for you too, I will close the issue.
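
The same edit can be scripted instead of made by hand. This is a sketch only: it assumes the paths in your makefile contain the fragment hdf5/mpich exactly as in the lines above, and switch_hdf5_to_openmpi is a hypothetical helper name.

```shell
# Rewrite hdf5/mpich path fragments to hdf5/openmpi, keeping a .bak copy.
switch_hdf5_to_openmpi() {
    # $1 = makefile path
    sed -i.bak 's#hdf5/mpich#hdf5/openmpi#g' "$1"
}

# Usage (run from the LUMA directory, then inspect the result before building):
#   switch_hdf5_to_openmpi makefile
#   grep -n 'hdf5/' makefile
```

The original file is kept as makefile.bak, so the change can be undone by copying it back.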


H0sseini avatar H0sseini commented on July 23, 2024

Thank you for your response. I have done as you instructed: I edited the makefile and ran make clean && make. But when I ran mpirun -np 8 ./LUMA, I got the same error. Here are my makefile and .log files:
makefile.zip
log.zip


huadongy avatar huadongy commented on July 23, 2024

I got the same error.

When LUMA is freshly installed without running LUMA/cases/testsuite/scripts/runTestSuite.sh, the compilation with make works.

After running runTestSuite.sh, a re-compilation with 'make clean; make' will show the above problem.


aharwood2 avatar aharwood2 commented on July 23, 2024

Is it possible that the runTestSuite.sh script exports some local variables which are then used by the makefile, hence changing its behaviour? You should be able to edit the makefile to force it to use what you want, as per the above.


ianhinder avatar ianhinder commented on July 23, 2024

Processes in Linux cannot change their parent's environment variables, so running runTestSuite.sh cannot modify the environment that make sees.
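
This can be demonstrated in a couple of lines. LUMA_DEMO_VAR is a made-up variable name used purely for illustration:

```shell
# Start from a known state in the parent shell.
unset LUMA_DEMO_VAR

# The child process sets and exports the variable, then exits.
sh -c 'export LUMA_DEMO_VAR=changed'

# The parent shell is unaffected by what the child did.
echo "parent sees: ${LUMA_DEMO_VAR:-<unset>}"   # prints "parent sees: <unset>"
```

The only way a script could change the calling shell's variables is if it were sourced (run with . or source) rather than executed.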

As far as I can tell, running runTestSuite.sh overwrites the definitions.h file in the inc directory with those of the test cases, without putting it back again afterwards.

So if compilation works from a clean checkout, but fails after running the test suite, it's probably because the definitions.h file has changed. You can restore it from the fresh installation (if it's checked out with git, you can do git checkout inc/definitions.h) and try the make again.
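
A minimal sketch of that check, assuming the code was checked out with git and that you are in the top-level LUMA directory. restore_if_modified is a hypothetical helper name:

```shell
# Restore a tracked file from git if the working copy differs from the checkout.
restore_if_modified() {
    # $1 = path tracked by git
    if ! git diff --quiet -- "$1"; then
        echo "$1 differs from the checked-out version; restoring"
        git checkout -- "$1"
    fi
}

# Usage (this discards local edits to the file, so look at the diff first):
#   git diff -- inc/definitions.h
#   restore_if_modified inc/definitions.h
```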


ianhinder avatar ianhinder commented on July 23, 2024

I got the same error.

@huadongy: there are several errors reported above; please can you tell us exactly which error you are getting? The best thing would be if you could tell us all the steps you have taken since a clean checkout, and what error you get. Thanks!


huadongy avatar huadongy commented on July 23, 2024

Please find the error message in the attached file. In fact, I did nothing special: I only downloaded the software and compiled it with 'make'.

log.error.txt


ianhinder avatar ianhinder commented on July 23, 2024

Hi @huadongy, I've looked at the log.error.txt file that you sent, but the error messages look completely different to the ones we have been talking about in this issue. Your first error is

src/../inc/GridManager.h:59:21: error: ‘L_NUM_LEVELS’ was not declared in this scope
  int global_size[3][L_NUM_LEVELS * L_NUM_REGIONS + 1];
                     ^~~~~~~~~~~~

Is that correct? It sounds like a separate problem. A wild guess would be that you have modified the definitions.h file in some way, or it has been modified by the test system. This github issue has now become very confusing; unless you are getting the same error message, please can I ask you to open a new issue? In that issue, it would be good to detail all the steps you have taken. For example, what OS are you using? Which version did you download? Have you installed all the prerequisites according to the instructions on the wiki? What is the sequence of commands that you ran to get the above errors? As Adrian said, it works for us, so we need to know these details to know what you are doing differently. Thanks!
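
One way to test that guess is to check whether the macros named in the error are still defined in the header. This is a sketch only, assuming they are normally set with plain #define lines in inc/definitions.h; missing_macros is a hypothetical helper.

```shell
# Print each macro name that has no active #define line in the given header.
missing_macros() {
    # $1 = header path; remaining arguments = macro names to look for
    header=$1
    shift
    for name in "$@"; do
        grep -Eq "^[[:space:]]*#[[:space:]]*define[[:space:]]+$name([[:space:]]|\$)" "$header" \
            || echo "$name"
    done
}

# Macro names taken from the compiler error above; the path is assumed.
if [ -f inc/definitions.h ]; then
    missing_macros inc/definitions.h L_NUM_LEVELS L_NUM_REGIONS
fi
```

If either name is printed, the header in use is not the one the rest of the code expects.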


aharwood2 avatar aharwood2 commented on July 23, 2024

Closing, as the new problem has been moved to a separate issue.

