
Comments (7)

nikizadehgfdl avatar nikizadehgfdl commented on August 21, 2024

BTW the model does reproduce across a fv_layout change, atmos_threads change or ocean_layout change (with no mask_table).

from icebergs.

nikizadehgfdl avatar nikizadehgfdl commented on August 21, 2024

@underwoo wrote:
"Looking at the stdout's from the CM4_c96L32_am4g5r2_2000_sis2 runs, the "Total Ice Mass|Salt|Heat" are all slightly different from the first print out. Points to something in the iceberg initialization. (The noberg runs do not show the difference in "Total Ice ..".)

Also, I don't see any iceberg restart files in the initCond file. Could you please run a test that uses iceberg restart files to see if that will reproduce across layout changes."

So, I did try that and the answers indeed reproduced across the same ice_layout change!

/// /lustre/f1/Niki.Zadeh/ulm_201505_awg_v20150702_mom6sis2_2015.08.06b/CM4_c96L32_am4g5r2_2000_sis2_withBergsInInitcond/ncrc2.intel-repro-openmp/archive/1x0m10d_2560pe/restart/00010121.tar
\\\ /lustre/f1/Niki.Zadeh/ulm_201505_awg_v20150702_mom6sis2_2015.08.06b/CM4_c96L32_am4g5r2_2000_sis2_withBergsInInitcond/ncrc2.intel-repro-openmp/archive/1x0m10d_2561pe/restart/00010121.tar

The only difference was:
      Comparing icebergs.res.nc...
DIFFER : VARIABLE : lon : POSITION : 0 : VALUES : -265.452 <> -267.218

All I did was use one of the restart tars from a 10-day experiment (the 2560 one) as the initCond and repeat the runs.

Here are the stdouts:

/lustre/f1/Niki.Zadeh/ulm_201505_awg_v20150702_mom6sis2_2015.08.06b/CM4_c96L32_am4g5r2_2000_sis2_withBergsInInitcond/ncrc2.intel-repro-openmp/stdout/run/CM4_c96L32_am4g5r2_2000_sis2_withBergsInInitcond_1x0m10d_2560pe.o5041290 

/lustre/f1/Niki.Zadeh/ulm_201505_awg_v20150702_mom6sis2_2015.08.06b/CM4_c96L32_am4g5r2_2000_sis2_withBergsInInitcond/ncrc2.intel-repro-openmp/stdout/run/CM4_c96L32_am4g5r2_2000_sis2_withBergsInInitcond_1x0m10d_2561pe.o5041289

So, Seth, what do you make of this?


jwdGFDL avatar jwdGFDL commented on August 21, 2024

If I recall correctly, the initialization algorithm picks out the
icebergs a given rank owns. One possibility is a layout-dependent
flaw that attributes an iceberg to multiple ranks, or to no rank,
thus leaving it out of further simulation.
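
The failure mode hypothesized above can be sketched in a few lines. This is only an illustration, not the icebergs initialization code: the function `owning_ranks`, the boundary rules, and the berg position are all hypothetical. It shows how a berg lying exactly on a subdomain boundary is owned once under a half-open test, twice under a closed test, and by nobody under an open test, and how that outcome changes with the layout.

```python
def owning_ranks(x, edges, rule):
    """Return the ranks that claim position x under the given boundary rule.

    edges holds the subdomain boundaries, so rank r spans edges[r]..edges[r+1].
    """
    tests = {
        "half_open": lambda lo, hi: lo <= x < hi,  # one owner per interior point
        "closed": lambda lo, hi: lo <= x <= hi,    # shared edges are claimed twice
        "open": lambda lo, hi: lo < x < hi,        # shared edges are claimed by nobody
    }
    inside = tests[rule]
    return [r for r, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])) if inside(lo, hi)]


berg_lon = 10.0                # hypothetical berg sitting exactly on a boundary
one_rank = [0.0, 20.0]         # layout A: a single subdomain
two_ranks = [0.0, 10.0, 20.0]  # layout B: a boundary at the berg position

for rule in ("half_open", "closed", "open"):
    print(rule, owning_ranks(berg_lon, one_rank, rule), owning_ranks(berg_lon, two_ranks, rule))
```

With layout A every rule finds exactly one owner; with layout B only the half-open rule does, which is the kind of layout dependence described above.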


Reply to this email directly or view it on GitHub
#13 (comment).

Jeff Durachta
Engineering Lead for Modeling Services
NOAA Geophysical Fluid Dynamics Lab
Forrestal Campus, Princeton University
201 Forrestal Road
Princeton, NJ 08540
Office: +1-609-987-5054


adcroft avatar adcroft commented on August 21, 2024

I've been unable to make an ice-ocean configuration fail reproducibility tests in which I seed every model cell with four bergs moving in the cardinal directions.

Looking at the logs @nikizadehgfdl provided, it looks like there is a difference in the calving restart checksum. How does this happen?

> grep restart_calv /lustre/f1/Niki.Zadeh/ulm_201505_awg_v20150702_mom6sis2_2015.08.06b/CM4_c96L32_am4g5r2_2000_sis2/ncrc2.intel-repro-openmp/stdout/run/CM4_c96L32_am4g5r2_2000_sis2_1x0m10d_2560pe.o7149337 /lustre/f1/Niki.Zadeh/ulm_201505_awg_v20150702_mom6sis2_2015.08.06b/CM4_c96L32_am4g5r2_2000_sis2/ncrc2.intel-repro-openmp/stdout/run/CM4_c96L32_am4g5r2_2000_sis2_1x0m10d_2561pe.o7149320
CM4_c96L32_am4g5r2_2000_sis2_1x0m10d_2560pe.o7149337:diamonds, grd_chksum3: read_restart_calvi chksum=           -1896008147 chksum2=           -1545844752 min= 0.000000000E+00 max= 7.399996075E+11 mean= 9.751374716E+10 rms= 1.634493874E+11 sd= 1.311745835E+11
CM4_c96L32_am4g5r2_2000_sis2_1x0m10d_2561pe.o7149320:diamonds, grd_chksum3: read_restart_calvi chksum=            -185424424 chksum2=            1140423430 min= 0.000000000E+00 max= 7.399992858E+11 mean= 9.750156607E+10 rms= 1.634237307E+11 sd= 1.311516694E+11

There is also this line:

< OCN(ATMOCNLND)=  0.354793438964402       0.354793438964402    0.354793438964402
> OCN(ATMOCNLND)=  0.354433472151885       0.354433472151885    0.354433472151885

which has nothing to do with icebergs.
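
The checksum comparison done with `grep` above can also be scripted. The following is a minimal sketch, assuming the log lines keep the `label chksum= N chksum2= M` shape seen in the output quoted in this comment; the helper names `parse_checksums` and `differing_labels` are hypothetical, and the two sample lines are shortened copies of the lines above.

```python
import re

# Matches e.g. "read_restart_calvi chksum=   -1896008147 chksum2=   -1545844752"
CHKSUM_RE = re.compile(r"(\w+)\s+chksum=\s*(-?\d+)\s+chksum2=\s*(-?\d+)")


def parse_checksums(lines):
    """Map each checksum label to its (chksum, chksum2) pair."""
    found = {}
    for line in lines:
        match = CHKSUM_RE.search(line)
        if match:
            found[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return found


def differing_labels(log_a, log_b):
    """Return the labels present in both logs whose checksums disagree."""
    a, b = parse_checksums(log_a), parse_checksums(log_b)
    return sorted(label for label in a.keys() & b.keys() if a[label] != b[label])


# Shortened copies of the two stdout lines quoted above.
log_2560 = ["diamonds, grd_chksum3: read_restart_calvi chksum=           -1896008147 chksum2=           -1545844752 min= 0.000000000E+00"]
log_2561 = ["diamonds, grd_chksum3: read_restart_calvi chksum=            -185424424 chksum2=            1140423430 min= 0.000000000E+00"]

print(differing_labels(log_2560, log_2561))
```

To run it on real stdout files, the lists could be replaced by the lines read from each file.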


Zhi-Liang avatar Zhi-Liang commented on August 21, 2024

Hi Niki,

< OCN(ATMOCNLND)=  0.354793438964402       0.354793438964402    0.354793438964402
> OCN(ATMOCNLND)=  0.354433472151885       0.354433472151885    0.354433472151885

This printout is from xgrid.F90. The calculation is based on random
numbers, so it cannot reproduce across processor counts.

Zhi


underwoo avatar underwoo commented on August 21, 2024

There is a namelist option 'make_calving_reproduce' in the ice_sis version
of ice_bergs. Niki, please check whether this option is in the new icebergs,
and whether it is set to .true. in your namelists.

Seth Underwood
Engility

Modeling Systems Group
GFDL/NOAA/DOC
201 Forrestal Road
Princeton, NJ 08540-6649

(609) 452-5847 Office
(304) 376-9002 Cell
(609) 987-5063 Fax
[email protected]


nikizadehgfdl avatar nikizadehgfdl commented on August 21, 2024

Thanks, that was the problem. The model reproduced across ice_layout change after I set the iceberg namelist make_calving_reproduce = .true.
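
For anyone wanting to verify the setting before launching a run, here is a rough sketch of a check on the namelist text. The group name `icebergs_nml` is an assumption (the thread only says "the iceberg namelist"), the helper `namelist_flag_is_true` is hypothetical, and the parsing is deliberately simple: it assumes no `/` or `!` comments appear inside the group.

```python
import re


def namelist_flag_is_true(text, group, name):
    """Return True if `name` is set to a true logical inside `&group ... /`."""
    block = re.search(rf"&{re.escape(group)}\b(.*?)/", text, re.S | re.I)
    if block is None:
        return False  # group not present at all
    setting = re.search(rf"\b{re.escape(name)}\s*=\s*(\S+)", block.group(1), re.I)
    if setting is None:
        return False  # flag not set, so the model default applies
    value = setting.group(1).rstrip(",").lower()
    # Fortran namelist input reads a logical as true when it starts with T or .T
    return value.lstrip(".").startswith("t")


# Assumed layout of the relevant namelist group (group name is a guess).
sample = """
&icebergs_nml
    make_calving_reproduce = .true.
/
"""

print(namelist_flag_is_true(sample, "icebergs_nml", "make_calving_reproduce"))
```

The same function could be pointed at the contents of the run's namelist file in place of `sample`.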
