Giter Club home page Giter Club logo

Comments (15)

yantosca avatar yantosca commented on August 23, 2024

Thanks for writing @yuz-wx. There was a similar issue #318 that was solved by using longitude in 0-360 range when specifying the target longitude. But I'm not sure that applies to your issue.

However, we just received a fix in GCPy (see geoschem/gcpy#311) for cubed sphere regridding. We will bring this into the dev branch soon. In the meantime you could pull the fix into your GCPy clone and then see if that results in a better-regridded restart file.

from gchp.

yuz-wx avatar yuz-wx commented on August 23, 2024

Dear yantosca, the restart file made by GCPY can be used successfully, but the GCHP generated restart file has such an issue. I think the cause of the issue is the inversion from double64 to real32. A double64 number is directly given to a real32 variable, which carries a truncation error because it is not the rounding number. I revised related lines in 'base/MAPL_CubedSphereGridFactory.F90' by an inelegant way to solve this problem temporarily and let the factories equal (it turns 0.645771756658875 to 0.6457718, instead of 0.6457717). But I think it might be better to compare the target_lat written in restart file and GCHP.rc directly (compare 37.0 with 37.0), instead of comparing the calculated one (compare 0.6457718 with 0.6457718). Hope it can be updated in the future. Thank you very much!

Here's my revised lines:
image

The result of my revise:
image

from gchp.

yantosca avatar yantosca commented on August 23, 2024

Thanks @yuz-wx. Tagging @lizziel and @yidant

from gchp.

lizziel avatar lizziel commented on August 23, 2024

Hi @yuz-wx, thanks for raising this issue! We have not had a report before of this problem but I can see how it might show up. Doing a check that the stretched grid attributes match up between restart and config is a relatively recent addition to MAPL. I will create an issue with NASA GMAO and also try to put a work-around into our MAPL fork in time for 14.4.0. I will link to the PR here once I have one.

from gchp.

yuz-wx avatar yuz-wx commented on August 23, 2024

Hi @lizziel, thank you for your comment! Look forward to the following update!

Here's my conclusion for advice:

  1. Just compare the value written in restart and config might be better, instead of the calculated values.
  2. If it must compare the calculated values, please focus on the precision of different values.
  3. I think using rounding value is not a good idea because when the results are 0.645771756658875 and 0.645771746658875, it will compare 0.6457718 and 0.6457717 and fails.

Here's the related lines in code:

  1. MAPL_Generic.F90: *"Factories not equal"
  2. MAPL_CubedSphereGridFactory.F90: function physical_params_are_equal
  3. MAPL_CubedSphereGridFactory.F90: subroutine check_and_fill_consistency

from gchp.

lizziel avatar lizziel commented on August 23, 2024

HI @yuz-wx, here is a quick fix for the stretched grid restart file issue: geoschem/MAPL#34

from gchp.

kilicomu avatar kilicomu commented on August 23, 2024

Hi @yantosca , @lizziel - I think I've found another way that this problem can sneak in!

@amy1916 was seeing the same issue as described by @yuz-wx so we read through this thread and applied the fix that Lizzie made to the MAPL submodule. We were still seeing the same Factories not equal error, and after some investigation we discovered that the number stored in the gcchem_internal_checkpoint file for the TARGET_LON attribute was not quite correct.

To demo this, let's take a stretched grid restart file that was created using GCPy regridding tools (FILE_1) and a checkpoint file produced by GCHP (FILE_2). If we ncdump the files and look at the attributes, we see:

FILE_1:

// global attributes:
                :STRETCH_FACTOR = 2.f ;
                :TARGET_LAT = 40.f ;
                :TARGET_LON = 285.f ;

FILE_2:

// global attributes:
                :STRETCH_FACTOR = 2.f ;
                :TARGET_LAT = 40.f ;
                :TARGET_LON = 285.f ;

All looks good, right?

Except, if we open both using xarray and take a look at the values of the attributes, we get:

In [6]: DATA.attrs['TARGET_LON'] == DATA2.attrs['TARGET_LON']
Out[6]: False

In [7]: DATA.attrs['TARGET_LON']
Out[7]: 285.0

In [8]: DATA2.attrs['TARGET_LON']
Out[8]: 284.99997

Just to sanity check against some kind of printing problem I modified initialize_from_file_metadata in MAPL_CubedSphereGridFactory to check the value of the TARGET_LON that is read in against the expected value and it confirmed that the value read in from the file is incorrect - I don't think any amount of precision changing down the line can guard that!

I haven't had a chance to dig into the checkpoint writing code to figure out why this number was getting into the checkpoint with an error, but something I did notice is that there are type differences between variables in FILE_1 and FILE_2, for example:

In [4]: DATA
Out[4]: 
<xarray.Dataset>
Dimensions:              (lat: 1200, lon: 200, lev: 72, time: 1)
Coordinates:
  * lev                  (lev) float64 1.0 2.0 3.0 4.0 ... 69.0 70.0 71.0 72.0
  * time                 (time) datetime64[ns] 2022-05-01
  * lat                  (lat) float64 0.0 1.0 2.0 ... 1.198e+03 1.199e+03
  * lon                  (lon) float64 0.0 1.0 2.0 3.0 ... 197.0 198.0 199.0
Data variables: (12/354)
    ARCHV_DRY_TOTN       (lat, lon) float32 ...
    ARCHV_WET_TOTN       (lat, lon) float32 ...
    AREA                 (lat, lon) float32 ...
    AeroH2O_SNA          (lev, lat, lon) float32 ...
    BXHEIGHT             (lev, lat, lon) float32 ...
    DELP_DRY             (lev, lat, lon) float32 ...
    ...                   ...
    SPC_pFe              (lev, lat, lon) float32 ...
    STATE_PSC            (lev, lat, lon) float32 ...
    T_DAVG               (lat, lon) float32 ...
    T_PREVDAY            (lat, lon) float32 ...
    TropLev              (lat, lon) float32 ...
    WetDepNitrogen       (lat, lon) float32 ...
Attributes:
    STRETCH_FACTOR:  2.0
    TARGET_LAT:      40.0
    TARGET_LON:      285.0

In [5]: DATA2
Out[5]: 
<xarray.Dataset>
Dimensions:         (lon: 200, lat: 1200, lev: 72, time: 1)
Coordinates:
  * lon             (lon) float64 1.0 2.0 3.0 4.0 ... 197.0 198.0 199.0 200.0
  * lat             (lat) float64 1.0 2.0 3.0 ... 1.198e+03 1.199e+03 1.2e+03
  * lev             (lev) float64 1.0 2.0 3.0 4.0 5.0 ... 69.0 70.0 71.0 72.0
  * time            (time) datetime64[ns] 2022-05-14
Data variables: (12/335)
    ARCHV_DRY_TOTN  (lat, lon) float32 ...
    ARCHV_WET_TOTN  (lat, lon) float32 ...
    AREA            (lat, lon) float64 ...
    AeroH2O_SNA     (lev, lat, lon) float64 ...
    BXHEIGHT        (lev, lat, lon) float64 ...
    DELP_DRY        (lev, lat, lon) float64 ...
    ...              ...
    SPC_pFe         (lev, lat, lon) float64 ...
    STATE_PSC       (lev, lat, lon) float32 ...
    T_DAVG          (lat, lon) float32 ...
    T_PREVDAY       (lat, lon) float32 ...
    TropLev         (lat, lon) float64 ...
    WetDepNitrogen  (lat, lon) float64 ...
Attributes:
    STRETCH_FACTOR:  2.0
    TARGET_LAT:      40.0
    TARGET_LON:      284.99997

which hints to me that somewhere the type of variables going into the checkpoint files is not being made explicit.

If I can find some time I'll get into the checkpoint writing code, but my GEOS-Chem time is quite thin at the moment!

Hope that's helpful, and if you think this would be more appropriate as a separate bug report I'll happily move it.

from gchp.

lizziel avatar lizziel commented on August 23, 2024

Thanks for reporting this still isn't working. I'll take a look at a more comprehensive fix. Indeed looks like we need to better track/preserve the type. This may already be fixed in a newer version of MAPL. I'll see what I can come up with.

from gchp.

lizziel avatar lizziel commented on August 23, 2024

I still have not been able to reproduce the issue. So far I have been able to submit sequential stretched grid runs without hitting the error. I'll try your exact stretch params and see if that makes a difference.

from gchp.

lizziel avatar lizziel commented on August 23, 2024

I created a stretched grid restart file using your parameters but still cannot reproduce the issue. Besides running for three consecutive simulations I opened each checkpoint and printed the stretch attributes.

import xarray as xr
ds0=xr.open_dataset('GEOSChem.Restart.20190101_0000z.c24.nc4')
ds1=xr.open_dataset('GEOSChem.Restart.20190101_0100z.c24.nc4')
ds2=xr.open_dataset('GEOSChem.Restart.20190101_0200z.c24.nc4')
ds3=xr.open_dataset('GEOSChem.Restart.20190101_0300z.c24.nc4')
ds0.attrs['TARGET_LON']
103.0
ds1.attrs['TARGET_LON']
103.0
ds2.attrs['TARGET_LON']
103.0
ds3.attrs['TARGET_LON']
103.0
ds0.attrs['TARGET_LAT']
37.0
ds1.attrs['TARGET_LAT']
37.0
ds2.attrs['TARGET_LAT']
37.0
ds3.attrs['TARGET_LAT']
37.0

Have you tried using the latest MAPL version used in 14.4? I wonder if there was a mistake if you manually applied the fix.

from gchp.

kilicomu avatar kilicomu commented on August 23, 2024

Hi @lizziel - thanks for trying to reproduce the error. I'm now very confused as to how the stretch parameters have become wrong in the restart file!

I'll chat with @amy1916 and see if I can reliably reproduce the problem on our end.

from gchp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.