Comments (7)
Substituting the Intel compiler and Intel MPI gets past this step, but the run later crashes with a segmentation fault (my guess is that this is because of the outdated ESMF v8.3.1).
I think we should pursue trying to get this configuration to work. Why do you think the seg fault is happening later due to outdated ESMF? Is there an ESMF error message?
If you can get it working with this configuration of libraries then at least you will have it running. We can then go back and figure out the issues with GNU/OpenMPI, potentially creating a GitHub issue with the MAPL developers.
from gchp.
@lizziel Sure. Shall I continue here or open another issue? I am not sure whether it is related to ESMF. Maybe we can find some hints in the log files.
gchp.20190701_0000z.log
runjob.log
It seems like a Cloud-J problem. I tried to identify the line the backtrace points to, and I think it is just an ordinary loop.
==== backtrace (tid: 67) ====
0 0x00000000016a2ba0 fjx_sub_mod_mp_blkslv_() /Projects/GCHP/14.3/src/GCHP_GridComp/Cloud-J/src/Core/fjx_sub_mod.f90:1355
1355 do K = 1,W_+W_r
1356 if (LDOKR(K) .gt. 0) then
1357 call GEN_ID (POMEGA(1,1,K),FZ(1,K),ZTAU(1,K),FSBOT(K),RFL(1,K), &
1358 PM,PM0, B(1,1,1,K),CC(1,1,1,K),AA(1,1,1,K), &
1359 A(1,1,K),H(1,1,K),C(1,1,K), ND)
1360 endif
1361 enddo
Intel compiler: 19.1.0.166 20191121
Intel MPI: Version 2019 Update 6 Build 20191024
ESMF: 8.3.1
Yes, please make a new issue for this. Thanks!
Hi @yuanjianz and @yidant, has the issue with GNU and OpenMPI been resolved?
Hi @lizziel, I tested it just now. It is still hanging at
Bootstrapping Variable: T_PREVDAY in gchp_restart.nc4
when using more than 1 node (the exact test scenario is 48x2=96 cores, C30, GEOS-IT native mass flux).
I ran the GEOS-IT and MERRA-2 C24 benchmarks with 72 cores on a single node with GNU successfully yesterday, so I assume it could be an MPI issue.
An update (2024.3.16):
I am closing this issue because I found that it was caused by an old version of OpenMPI. The official GCHP Docker image geoschem/gchp:14.3.0 currently uses OpenMPI 3.0.5, which does not meet the OpenMPI version >= 4 recommended in the official documentation.
I manually updated it to 4.1.1, and the performance of MPI jobs improved significantly. I will be working with @yidant to update the official Docker image.
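For anyone wanting to catch this mismatch before launching a multi-node run, here is a minimal Python sketch of the version comparison described above. It is not part of GCHP. The function names, the `MIN_OPENMPI` constant, and the example banner strings (modeled on what `mpirun --version` typically prints for Open MPI) are assumptions for illustration only; the only facts taken from this thread are the versions 3.0.5 and 4.1.1 and the ">= 4" recommendation.

```python
import re

# Assumption: "OpenMPI version >= 4" from the comment above, read as 4.0.0.
MIN_OPENMPI = (4, 0, 0)


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = match.groups()
    # A missing patch component (e.g. "4.0") is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def meets_minimum(text, minimum=MIN_OPENMPI):
    """True if the version found in text is at least the given minimum."""
    return parse_version(text) >= minimum


if __name__ == "__main__":
    # Hypothetical banner strings; paste the real output of your MPI launcher.
    for banner in ("mpirun (Open MPI) 3.0.5", "mpirun (Open MPI) 4.1.1"):
        status = "ok" if meets_minimum(banner) else "older than recommended"
        print(f"{banner}: {status}")
```

The comparison uses tuples of integers rather than strings so that, for example, 4.10.0 would sort after 4.9.0.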
Ah, excellent. Yes, I think we had to update to OpenMPI 4.0 quite a while ago.