Comments (7)

lizziel avatar lizziel commented on August 23, 2024

substituting the Intel compiler and Intel MPI would pass this step but end up crashing with a segmentation fault later (my guess is because of the outdated ESMF v8.3.1)

I think we should pursue trying to get this configuration to work. Why do you think the seg fault is happening later due to outdated ESMF? Is there an ESMF error message?

If you can get it working with this configuration of libraries then at least you will have it running. We can then go back and figure out the issues with GNU/OpenMPI, potentially creating a GitHub issue with the MAPL developers.

from gchp.

yuanjianz avatar yuanjianz commented on August 23, 2024

@lizziel Sure. Shall I continue here or open another issue? I am not sure whether it is related to ESMF. Maybe we can find some hints in the log files:
gchp.20190701_0000z.log
runjob.log

It seems like a Cloud-J problem. I tried to identify the line the backtrace points to, and I think it is just an ordinary loop.

==== backtrace (tid:     67) ====
 0 0x00000000016a2ba0 fjx_sub_mod_mp_blkslv_()  /Projects/GCHP/14.3/src/GCHP_GridComp/Cloud-J/src/Core/fjx_sub_mod.f90:1355
1355       do K = 1,W_+W_r
1356       if (LDOKR(K) .gt. 0) then
1357        call GEN_ID (POMEGA(1,1,K),FZ(1,K),ZTAU(1,K),FSBOT(K),RFL(1,K), &
1358              PM,PM0, B(1,1,1,K),CC(1,1,1,K),AA(1,1,1,K), &
1359                      A(1,1,K),H(1,1,K),C(1,1,K), ND)
1360       endif
1361       enddo

Intel compiler: 19.1.0.166 20191121
Intel MPI: Version 2019 Update 6 Build 20191024
ESMF: 8.3.1

lizziel avatar lizziel commented on August 23, 2024

Yes, please make a new issue for this. Thanks!

lizziel avatar lizziel commented on August 23, 2024

Hi @yuanjianz and @yidant, has the issue with GNU and OpenMPI been resolved?

yuanjianz avatar yuanjianz commented on August 23, 2024

Hi @lizziel, I tested it just now. It is still hanging at the

Bootstrapping Variable: T_PREVDAY in gchp_restart.nc4

when using more than one node (the exact test scenario is 48x2=96 cores, C30, GEOS-IT native mass flux).

I ran the GEOS-IT and MERRA-2 C24 benchmarks with 72 cores on a single node with GNU successfully yesterday, so I assume it could be an MPI issue.

yuanjianz avatar yuanjianz commented on August 23, 2024

Update (2024.3.16):

I am closing this issue because I found that the problem was caused by an old version of OpenMPI. The official GCHP Docker image geoschem/gchp:14.3.0 currently uses OpenMPI 3.0.5, which does not meet the OpenMPI >= 4 recommendation in the official documentation.

I manually updated it to 4.1.1, and the performance of MPI jobs improved significantly. I will be working with @yidant to update the official Docker image.
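
For anyone hitting the same multi-node hang, here is a minimal sketch of how the OpenMPI version could be checked against the >= 4 recommendation before running. It assumes the first line of `mpirun --version` looks like `mpirun (Open MPI) 3.0.5`. The function names and the `MIN_OPENMPI` constant are hypothetical helpers for illustration and are not part of GCHP.

```python
import re

# Minimum version from the recommendation quoted above (OpenMPI >= 4).
# The zero minor/patch levels are an assumption for comparison purposes.
MIN_OPENMPI = (4, 0, 0)


def parse_openmpi_version(banner):
    """Return (major, minor, patch) from an `mpirun --version` banner.

    Returns None when the banner does not look like Open MPI output,
    for example when another MPI implementation is on the PATH.
    """
    match = re.search(r"\(Open MPI\)\s+(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def meets_minimum(banner, minimum=MIN_OPENMPI):
    """True only if the banner reports an Open MPI version >= minimum."""
    version = parse_openmpi_version(banner)
    return version is not None and version >= minimum


# Example banners using the two versions mentioned in this thread.
old_banner = "mpirun (Open MPI) 3.0.5"
new_banner = "mpirun (Open MPI) 4.1.1"
print(meets_minimum(old_banner), meets_minimum(new_banner))
```

In practice the banner string could be captured inside the container with `subprocess.run(["mpirun", "--version"], capture_output=True, text=True).stdout`, but that depends on `mpirun` being on the PATH.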

lizziel avatar lizziel commented on August 23, 2024

Ah, excellent. Yes, I think we had to update to OpenMPI 4.0 quite a while ago.
