Giter Club home page Giter Club logo

geoschem / gcpy Goto Github PK

View Code? Open in Web Editor NEW
49.0 9.0 24.0 8.97 MB

Python toolkit for GEOS-Chem. Contains basic plotting scripts, plus the suite of GEOS-Chem benchmarking utilities.

Home Page: https://gcpy.readthedocs.io

License: Other

Python 40.97% Jupyter Notebook 58.97% Shell 0.05% Dockerfile 0.02%
python cloud-computing visualization-tools geos-chem atmospheric-modelling atmospheric-chemistry scientific-computing plotting-in-python benchmarking python-toolkit

gcpy's People

Contributors

dependabot[bot] avatar jiaweizhuang avatar jmoch1214 avatar jourdan-he avatar kilicomu avatar laestrada avatar liambindle avatar lizziel avatar msulprizio avatar sdeastham avatar williamdowns avatar yantosca avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

gcpy's Issues

[BUG/ISSUE] Allow comparisons of CS output with different resolutions

When comparing CS output at two different resolution, GCPy will crash with "ERROR: CS to CS regridding is not yet implemented in gcpy. Ref and dev cubed sphere grids must be the same resolution, or pass cmpres to compare_single_level as a lat-lon grid resolution." To get around this, we will set the default comparison grid to 1x1.25 lat-lon when comparing two different CS resolutions.

[BUG/ISSUE] Benchmark plotting functions fail if cmpres passed as cubed sphere

The comparison resolution argument cmpres that is passed to benchmark.py function compare_single_level is supposed to be able to be either cubed sphere or lat-lon. However, if it is cubed sphere then the function fails due to a type error. This is because cmpres is passed as string but the function expects it to be integer if cubed sphere in certain places. All cubed sphere comparisons to date have not passed cmpres explicitly, and hence this bug has not been previously detected. The quick fix is to cast cmpres to integer if cubed sphere and passed explicitly.

A related issue occurrs in benchmark.py function compare_zonal_mean. If cmpres is passed as cubed sphere then the function fails with a type error. For zonal mean the comparison resolution should never be cubed sphere but this is not enforced. The quick fix is to override the passed cubed sphere resolution with lat-lon '1x1.25'.

[BUG/ISSUE] Zeros shown as blue in cubed sphere comparison plots

When using gcpy benchmark function compare_single_level with cubed sphere output, 5 of 6 faces show up as blue. The southern pole face is white, as expected. This only impacts the difference plots and the fractional difference plot with dynamic range. The fractional difference plot with fixed range is fine. Plots on lat-lon grids are also fine.

Screen Shot 2019-08-15 at 1 02 48 PM

[FEATURE REQUEST] Global mass table

The GCSC has requested global mass tables for all species as part of the benchmark output. These values can be computed from the restart file output at the end of the simulation.

[BUG/ISSUE] Striping in fractional diff plots comparing cubed sphere output

When using gcpy function compare_single_level with cubed sphere output there appears to be horizontal striping in the fractional difference plot (lower left subplot). The striping does not appear in any of the other subplots. This seems like it may be an issue with either the calculation of the fractional difference or how python is plotting cubed sphere. The horizontal striping appears to only affect some of the faces, and the boundary between the striping and what looks like correct data is the vertical centerline of a face.

gcpy_cs_fracdiff_striping

[FEATURE REQUEST] Use multi-threading to speed up benchmark plotting

Creating GEOS-Chem output benchmark plots using the run_1mo_benchmark.py script (which calls routines in benchmark.py) can take an extremely long time. This is especially true for GCHP benchmark plots, which require regridding from cube-sphere to lon/lat.

It would be nice to use Python's multi-threading capabilities to e.g. create all of the PDF files for each category in parallel. This could drastically reduce the amount of time needed to generate benchmark plots.

[QUESTION] Trouble running gcpy

I ran into some trouble running gcpy on my Mac using Anaconda python:

  1. I created the gcpy conda environment using the gcpy environment.yml file

  2. I installed the gcpy directory using the command:
    git clone https://github.com/geoschem/gcpy.git gcpy
    in my directory '/Users/psk9/Desktop/GC'

I then ran my python gcpy analysis script from the directory '/Users/psk9/Desktop/GC'

The first problem I ran into was with the 'import xbpch' command:
Traceback (most recent call last): File "../quick_plots.py", line 29, in <module> import xbpch File "/anaconda3/envs/gcpy/lib/python3.6/site-packages/xbpch/__init__.py", line 7, in <module> from . bpch import BPCHFile File "/anaconda3/envs/gcpy/lib/python3.6/site-packages/xbpch/bpch.py", line 11, in <module> from xarray.core.pycompat import OrderedDict ImportError: cannot import name 'OrderedDict'

I fixed this by downgrading to xarray 11.0.3 with the command:
conda install xarray=0.11.3
in the gcpy environment

The next problem i ran into was the following:

Traceback (most recent call last): File "quick_plots.py", line 236, in <module> gcpy.compare_single_level( refdata, refstr, devdata, devstr, ilev=0, varlist=varlist) AttributeError: module 'gcpy' has no attribute 'compare_single_level'

To avoid this error and run successfully I had to set the PYTHON path variable
in my .bash_profile as follows:
export PYTHONPATH=":/Users/psk9/Desktop/GC/gcpy"

[FEATURE REQUEST] Add code to generate plots & tables for 1-year (fullchem) benchmark simulations

Overview

As of this writing, the following plots & tables for the 1-year (fullchem) benchmarks are currently generated by IDL code. We would to add this functionality in Python so that we can retire IDL. As this work progresses we will update this issue in the comments section below.

Already implemented in GCPy:

Plots:

  1. Species concentration by category
  2. Emissions by category
  3. J-values
  4. Column AOD

Tables:

  1. Emission totals table for 1-year benchmark simulations
  2. Inventory totals table for 1-year benchmark simulations
  3. Global masses at end of simulation

Still generated by IDL (to be eventually added into GCPy):

Plots

  1. Model vs. Observations
  2. Aerosol vertical profiles

Tables

  1. Aerosol burdens
  2. Global mean AOD
  3. O3 STE (vertical flux across 100hPa)

Related

Please see this issue about work that is being done to migrate the code for plotting results from the 1-year TransportTracers simulation from IDL to Python: #40

[BUG/ISSUE] xbpch cannot open ND64 bpch diagnostic data

I am posting this issue on behalf of Jonathan Moch.

Describe the bug

I've been trying to implement the UVflux (ND-64) diagnostics as netcdf and have had some success (the values are not all zero!), but am having trouble now doing the comparison with the .bpch diagnostic.

When I put the .bpch output and relavant .dat files into the xpbch.open_bpchdataset function in python I keep getting a "KeyError."

I'm currently trying to rerun the test simulation with just the ND64 output to see if that helps, but if you have any suggestions on what to do I would appreciate it. The relevant output is at: /n/scratchlfs/jacob_lab/jmoch/geoE_output/geosfp_4x5_uvflxtest/OutputDir/run1/

And the full error message, which is a bit opaque to me, is below. It looks like maybe there is a missing name of the diagnostic? Was the UVFlux diagnostic not set up to be recognized in xpbch?

Error message

KeyError                                  Traceback (most recent call last)
/n/jacob_lab/Lab/python/miniconda/envs/jacob/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2889             try:
-> 2890                 return self._engine.get_loc(key)
   2891             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'name'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-8-b267f079dc48> in <module>
      5     reftracerinfo = os.path.join(refdir,'tracerinfo.dat')
      6     refdiaginfo = os.path.join(refdir,'diaginfo.dat')
----> 7     refdata = xbpch.open_bpchdataset(refbpfile, tracerinfo_file=reftracerinfo, diaginfo_file=refdiaginfo)
      8 
      9 # Dev

/n/jacob_lab/Lab/python/miniconda/envs/jacob/lib/python3.6/site-packages/xbpch/core.py in open_bpchdataset(filename, fields, categories, tracerinfo_file, diaginfo_file, endian, decode_cf, memmap, dask, return_store)
     79         tracerinfo_file=tracerinfo_file,
     80         diaginfo_file=diaginfo_file, endian=endian,
---> 81         use_mmap=memmap, dask_delayed=dask
     82     )
     83     ds = xr.Dataset.load_store(store)

/n/jacob_lab/Lab/python/miniconda/envs/jacob/lib/python3.6/site-packages/xbpch/core.py in __init__(self, filename, fields, categories, fix_cf, mode, endian, diaginfo_file, tracerinfo_file, use_mmap, dask_delayed)
    278 
    279         # Parse the binary file and prepare to add variables to the DataStore
--> 280         self._bpch._read_var_data()
    281 
    282         # Create storage dicts for variables and attributes, to be used later

/n/jacob_lab/Lab/python/miniconda/envs/jacob/lib/python3.6/site-packages/xbpch/bpch.py in _read_var_data(self)
    312             var_attr['unit'] = unit
    313 
--> 314             vname = diag['name']
    315             fullname = category_name.strip() + "_" + vname
    316 

/n/jacob_lab/Lab/python/miniconda/envs/jacob/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2973             if self.columns.nlevels > 1:
   2974                 return self._getitem_multilevel(key)
-> 2975             indexer = self.columns.get_loc(key)
   2976             if is_integer(indexer):
   2977                 indexer = [indexer]

/n/jacob_lab/Lab/python/miniconda/envs/jacob/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2890                 return self._engine.get_loc(key)
   2891             except KeyError:
-> 2892                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2893         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2894         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'name'

[BUG/ISSUE] compare_zonal_mean flip_ref and flip_dev not working as expected

When comparing GCHP internal state output (checkpoint file) I use flip_ref and flip_dev to flip the vertical axis for zonal mean plots using compare_zonal_mean. It appears that it has no effect, however, and the plots show concentrations that are incorrect with respect to the y-axis.

For example, here is the correctly plotted SpeciesConc collection zonal mean for H2O, which does not require flipping.

SpeciesConc_H2O_ZM

In contrast, here is the internal restart file zonal mean for H2O:

SPC_H2O_ZM

To reproduce the issue, use compare_zonal_mean function with GCHP checkpoint files (gcchem*) and pass arguments flip_ref=True and flip_dev=True in one case, and default values (False) in another. If the resulting plots are the same then flip_ref and flip_dev do not work as expected.

Expected behavior: passing flip_ref and flip_dev as True to compare_zonal_mean should result in vertically flipping relative to not passing them. The zonal mean should resemble the SpeciesConc collection zonal mean.

This issue has gone under the radar since we do not benchmark GCHP restart files and the only History collection that is vertically flipped is emissions which we do not generate zonal mean plots for.

[BUG/ISSUE] Issue in gcpy.compare_single_level: Top title is not printed

I was using gcpy.compare_single_level to compare output between GCC and GCHP runs. The data variable I am using is Met_MODISLAI (which is only xy dimension). The data variables are shown here:

Ref (GCC)
<xarray.Dataset>
Dimensions:       (lat: 91, lon: 144, time: 1)
Coordinates:
  * time          (time) datetime64[ns] 2016-01-15T01:00:00
  * lon           (lon) float64 -180.0 -177.5 -175.0 ... 172.5 175.0 177.5
  * lat           (lat) float64 -89.5 -88.0 -86.0 -84.0 ... 84.0 86.0 88.0 89.5
Data variables:
    Met_MODISLAI  (time, lat, lon) float32 ...

Dev
<xarray.Dataset>
Dimensions:       (lat: 288, lon: 48, time: 1)
Coordinates:
  * lon           (lon) float64 1.0 2.0 3.0 4.0 5.0 ... 44.0 45.0 46.0 47.0 48.0
  * lat           (lat) float64 1.0 2.0 3.0 4.0 5.0 ... 285.0 286.0 287.0 288.0
  * time          (time) datetime64[ns] 2016-01-15T01:00:00
Data variables:
    Met_MODISLAI  (time, lat, lon) float32 ...

And my calling sequence is:

#!/usr/bin/env python

# Imports
import os
import xarray as xr
import gcpy

# Ref dataset
refstr  = '12.3.0 GCC'
refdir  = '~/LT/12.3.0/geosfp_2x25_TransportTracers/OutputDir'
reffile = 'GEOSChem.StateMet_inst.20160115_0100z.nc4'
refpath = os.path.join(refdir, reffile)
refdata = xr.open_dataset(refpath)

# Dev dataset
devstr  = '12.3.0 GCHP'
devdir  = '~/SP/if17/gchp_TransportTracers/OutputDir'
devfile = 'GCHP.StateMet_inst.20160115_0100z.nc4'
devpath = os.path.join(devdir, devfile)
devdata = xr.open_dataset(devpath)

varlist = ['Met_MODISLAI']

# Variables to plot

# PDF filename
pdfname = '/n/scratchlfs/jacob_lab/ryantosca/GC/rundirs/plots/pdf/MODIS_LAI_surface_nc_diag.pdf'

# Compare data
gcpy.compare_single_level(refdata, refstr, devdata, devstr,
                          varlist=varlist, pdfname=pdfname)

It produces the PDF, but for some reason the top title is not printed. The echoback to stdout is:

Reuse existing file: ./conservative_2x2.5_1x1.25.nc
/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/ ib/python3.6/site-packages/xesmf/backend.py:36: UserWarning: Input array is not F_CONTIGUUS. Will affect performance.
  warnings.warn("Input array is not F_CONTIGUOUS. "
Reuse existing file: ./conservative_c48_1x1.25_0.nc
Reuse existing file: ./conservative_c48_1x1.25_1.nc
Reuse existing file: ./conservative_c48_1x1.25_2.nc
Reuse existing file: ./conservative_c48_1x1.25_3.nc
Reuse existing file: ./conservative_c48_1x1.25_4.nc
Reuse existing file: ./conservative_c48_1x1.25_5.nc

Creating /n/scratchlfs/jacob_lab/ryantosca/GC/rundirs/plots/pdf/MODIS_LAI_surface_nc_diagpdf for 1 variables
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-ryantosca'
/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/ib/python3.6/site-packages/xarray/core/dataarray.py:507: FutureWarning: elementwise compaison failed; returning scalar instead, but in the future will perform elementwise comparion
  return key in self.data
0 Incorrect dimensions for Met_MODISLAI!

See attached file below.

I believe that the top title is computed in this block:

        if 'lev' in refdata[varname].dims and 'lev' in devdata[varname].dims:
            if ilev == 0: levstr = 'Surface'
            elif ilev == 22: levstr = '500 hPa'
            else: levstr = 'Level ' +  str(ilev-1)
            figs.suptitle('{}, {}'.format(varname,levstr), fontsize=fontsize, y=offset)
        elif 'lat' in refdata[varname].dims and 'lat' in devdata[varname] and 'lon' in refdata[varname].dims and 'lon' in devdata[varname]: 
            figs.suptitle('{}'.format(varname), fontsize=fontsize, y=offset)
        else:
            print('Incorrect dimensions for {}!'.format(varname))   

So I am not sure what is going on here.

no_toptitle

[DISCUSSION] Defining the scope & use cases of GCPy in README

Motivation

In README or somewhere there should be a list/table of what GCPy is intended to do (as well as NOT indented to do). One apparent question a new user will have is "in what cases should I download & learn GCPy?". Having a clear answer to that will bring many benefits:

  1. By narrowing down the scope of GCPy, we can out-source many common tasks to the upstream xarray/numpy/scipy. This avoids reinventing the wheel and reduces development burden.
  2. We can worry less about GCPy's API stability and backward compatibility. A major issue for new packages like GCPy is that the APIs can change frequently from version to version (often for good reasons like better efficiency or more functionalities). On the other hand, upstream packages have much more stable APIs. By having a small number of APIs, users will feel less disturbed by version updates.
  3. Some general-purpose functionalities can be pushed upstream to xarray so they can sustain longer and benefit a broader community. There are a lot of model-specific packages (for example NCAR's esmlab https://github.com/NCAR/esmlab), but I feel that many functionalities are duplicating. For example, the code handling cubed-sphere might be easily adapted for other multi-tile grids -- this is a very common need in Earth science models (e.g. xgcm also has a notion of multi-tile grid)

Examples

Some apparent GCPy use cases:

  1. Generate the standard benchmark plots
  2. Obtain GEOS-Chem's horizontal/vertical grid information
  3. GCHP-specific regridding functionalities

Some apparent non-GCPy use cases:

  1. Modify a NetCDF file (crop a domain, extract some variables)
  2. Simple 2D plotting on lat-lon grid, or 1D plotting against time
  3. All statistical analysis tools (should be done in SciPy/sklearn)

[QUESTION] Trouble setting up gcpy on Mac

Hi everyone,
I'm trying to setup gcpy on my Mac. I have anaconda3 installed. I downloaded gcpy from git and put it under my Desktop. I used this command to create the environment:
conda env create -vv -n gcpy gcpy/docs/environment.yml.
However, I got the error below:
Screen Shot 2019-05-16 at 4 21 19 PM
I was wondering if this is the right way to set up gcpy?
Does anybody have similar issues?

Jun Meng

[BUG/ISSUE] Display of all zeros in colorbar

We have had two different ways of conveying that all displayed values are zero. One is to show multiple zeroes spanning the entire colorbar.
gcpy_multiple_zeroes
Another is to show a single zero in the middle with the bounds of the colorbar unlabeled (current).
gcpy_single_zero
It would be good to get some feedback on preference. I prefer multiple zeroes since then there is no doubt about the range, but others may disagree.

[FEATURE REQUEST] CS to CS regridding in gcpy

When comparing cubed sphere data sets at different resolutions using compare_single_level I ran into this error:

Traceback (most recent call last):
  File "/n/home08/elundgren/bin/compare_var.py", line 20, in <module>
    gcpy.compare_single_level(ds1, desc1, ds2, desc2, varlist=varlist, weightsdir=weightsdir)
  File "/n/home08/elundgren/GC/python/gcpy/gcpy/benchmark.py", line 294, in compare_single_level
    refregridder_list = make_regridder_C2L(refres, cmpres, weightsdir=weightsdir, reuse_weights=True)
  File "/n/home08/elundgren/GC/python/gcpy/gcpy/grid/regrid.py", line 16, in make_regridder_C2L
    llgrid = make_grid_LL(llres_out)
  File "/n/home08/elundgren/GC/python/gcpy/gcpy/grid/horiz.py", line 12, in make_grid_LL
    [dlat,dlon] = list(map(float, llres.split('x')))
AttributeError: 'int' object has no attribute 'split'

I traced this to the logic in compare_single_level that determines what regridder to make. The logic is written such that if ref or dev is CS, and the comparison grid is different from the native, a CS to lat-lon regridder is created. However, if both ref and dev are cubed sphere the comparison grid is also cubed sphere, so the CS to lat-lon regridder will not work. Unless argument cmpres is passed such that it is explicitly set as lat-lon, e.g. cmpres='1x1'.

In summary there are two problems: (1) the CS to CS regridding is missing from gcpy (make_regridder_C2C does not exist), and (2) the logic that would call for CS to CS regridding is missing and instead a CS to lat-lon regridder is used, resulting in an error.

To address the second problem I pushed an update to return with an explanatory error message if CS to CS regridding is called for. See 4a2094d.

ERROR: CS to CS regridding is not yet implemented in gcpy. Ref and dev cubed sphere grids must be the same resolution, or pass cmpres to compare_single_level as a lat-lon grid resolution.

However, this should not be necessary. CS to CS regridding needs to be implemented into gcpy.

[QUESTION] gcbm does not have plot_layer

Right now I keep encountering an attribution error that gcpy.benchmark (gcbm) does not have plot_layer. I'm working through the first tutorial of benchmark_design.ipynb--does gcbm no longer have the plotting function?

[QUESTION] compare_zonal_mean not reading in Xarray.Dataset files?

For fields where I've used gcpy.open_mfdataset to open a list of files and then .mean(dim='time',keep_attrs=True) to get an annual average, gcpy.compare_zonal_mean is rejecting the fields saying they are not an axarray DataArray, even though the fields are definitely Xarray Datasets and gcpy.compare_single_level successfully loads the same exact files and makes figures with no problem.

The specific error I am getting is:


TypeError Traceback (most recent call last)
in
1 # Zonal mean - PDF
2 pdfname = os.path.join(plotsdir,'ZonalMean_{}_{}diag.pdf'.format(desc, comparison))
----> 3 gcpy.compare_zonal_mean( refdata, refstr, devdata, devstr, varlist=varlist, pdfname=pdfname )
4 #gcpy.add_bookmarks_to_pdf( pdfname, varlist, remove_prefix='SpeciesConc
')

~/Py_notebooks/gcpy/gcpy/benchmark.py in compare_zonal_mean(refdata, refstr, devdata, devstr, varlist, itime, weightsdir, pdfname, cmpres, match_cbar, pres_range, normalize_by_area, enforce_units, flip_ref, flip_dev, use_cmap_RdBu)
1050 for i in range(6):
1051 regridder = refregridder_list[i]
-> 1052 ds_ref_cmp += regridder(ds_ref_reshaped[i])
1053 else:
1054 ds_ref_cmp = ds_ref

/n/jacob_lab/Lab/python/miniconda/envs/gcpy/lib/python3.6/site-packages/xesmf/frontend.py in call(self, a)
260 regrid_func = self.regrid_dataarray
261 else:
--> 262 raise TypeError("input must be numpy array or xarray DataArray!")
263
264 return regrid_func(a)

TypeError: input must be numpy array or xarray DataArray!

[BUG/ISSUE] Fixed issues when reading GCHP files with MAPL v1.0.0

Description

Data variables from GCHP files created with MAPL v1.0.0 have 5 dimensions (time, lev, nf, Ydim, Xdim) -- where nf is the number of faces and Xdim = Ydim = number of points along the cube face.
This caused several issues with the prior GCPy code, which was developed to work with GCHP output from the prior version of MAPL. These output files contained 4 dimensions: (time, lev, 6*Ydim, Xdim).

In commit c7b1177, we have added several fixes in for these issues. We also skip reading a few data variables ("anchor", "ncontact", "orientation") from GCHP output created with MAPL v1.0.0, so as to avoid issues with GCPy and xarray.

[BUG/ISSUE] Benchmark plots omit comparisons unless species/emissions are found in both Ref and Dev

When creating the benchmark plots, species and/or emissions are only plotted if they found in both Ref and Dev datasets.

For example if a species is present in Ref, but removed from Dev, then the benchmark code will totally skip that comparison. This has caused some confusion among GCSC members who are examining the benchmark results.

What we should do is to set the missing species to zero and then force the comparison.

[QUESTION] Changing y axis on gcpy.compare_zonal_mean

For the gcpy.compare_zonal_mean the y axis is hPa, which ends up being an issue if you want to look at stratospheric changes and also gives you a distorted view of the troposphere. Is there a way to change the y-axis to log(pressure) or to km? That would give a better vertical profile where thing's are so compressed for the upper troposphere and above.

[FEATURE REQUEST] Benchmark plot change request from Stratospheric Working Group

Susan Strahan, Co-Chair of GEOS-Chem Stratospheric Working Group writes:
"Dylan, Seb, and I are happy with those plots EXCEPT they’ve got to be in log-pressure and run from about 100 to 1 hPa. So this is the only change to benchmarks we are recommending at this time. Only the plots made for the stratosphere need to change."

[BUG/ISSUE] run_1mo_benchmark.py dies with an error if we restrict the types of plots that are created

Describe the bug
If we restrict the types of benchmark plots that are created (which we often do for testing purposes), then the run_1mo_benchmark.py script will die with an error:

  File "./run_1mo_benchmark.py", line 267, in <module>
     sigdiff_files=gcc_vs_gcc_sigdiff)
  File "/n/home09/ryantosca/GC/python/gcpy/gcpy/benchmark.py", line 3098, in make_benchmark_conc_plots
    for v in diff_sfc:
UnboundLocalError: local variable 'diff_sfc' referenced before assignment

To Reproduce
Add the plots=['zonalmean'] keyword to make_benchmark_conc_plots:

if gcc_vs_gcc:

    if plot_conc:
        # Concentration plots
        # (includes lumped species and separates by category)
        print('%%% Creating GCC vs. GCC concentration plots %%%')
        bmk.make_benchmark_conc_plots(gcc_vs_gcc_refspc,
                                      gcc_vs_gcc_refstr,
                                      gcc_vs_gcc_devspc,
                                      gcc_vs_gcc_devstr,
                                      dst=gcc_vs_gcc_plotsdir,
                                      overwrite=True,
                                      restrict_cats=['Oxidants'],
                                      plots=['zonalmean'],
                                      sigdiff_files=gcc_vs_gcc_sigdiff)

Expected behavior
The code should create all benchmark plots and print the list of significant differences.

Potential solution
This could be easily fixed by not trying to update the list of significant differences for e.g. surface plots when we are only creating zonal mean plots. A few if statements can resolve this.

[BUG/ISSUE] Zonal mean plots with pressure subset are incorrect when y-axis flipped

Using the pressure subsetting option in compare_zonal_mean works correctly when the flip_ref and/or flip_dev are set to the default value (False). However, it seems that the plots are not correct if the vertical axis is flipped.

For example, here are concentration plots for Br2 mean full column (no subsetting) with the vertical axis of the Dev data flipped. This is comparing a GCC restart file with a GCHP restart file, so this flipping is necessary.

Br2_rst_zonal_log_partial

To hone in on the stratosphere I use the pressure subsetting for interval [1,100]. This is what is plotted:
Br2_rst_zonalstrat_log_partial

I suspect the issue is the order of the subsetting and the flipping. I will investigate.

[BUG/ISSUE] gcpy.open_mfdataset won't open .nc4 data

gcpy.open_mfdataset won't open .nc4 data. Instead it says it needs ".nc"

I think this is a bug since the default GCHP output is now .nc4.

The errors says: "ValueError: Found unknown file extension (.nc4); please pass a BPCH or netCDF file with extension 'bpch' or 'nc'."

[FEATURE REQUEST] Return dictionary from core.compare_varnames

Current core.compare_varnames outputs 6 lists:

    Returns:
    --------
        commonvars: list of strs
            Variables that are common to both refdata and devdata,
            regardless of dimension.

        commonvarsOther: list of strs
            Variables that are common to refdata and devdata,
            and that do not have lat, lon, and level dimensions.

        commonvars2D: list of strs
            Variables that are common to refdata and devdata,
            and that have lat and lon dimensions, but not level.

        commonvars3D: list of strs
            Variables that are common to refdata and devdata,
            and that have lat, lon, and level dimensions.

        refonly: list of strs
            Plottable variables (i.e. 2D or 3D) that are only
            present in the Ref dataset.

        devonly: list of strs
            Plottable variables (i.e. 2D or 3D) that are only
            present in the Dev dataset.

This can be difficult to use because you need to know exactly what each one is. Changing the output will also break existing code (just happened to me with the latest update).

It would be great if the output was instead a single dictionary with six key and value pairs (key = string description, value = list of names). This would allow expanding the output without breaking existing code and also make the output easier to understand and use.

[FEATURE REQUEST] Add species class for GCPy to use generally?

I would be interested in whether GCST/GCPy users thought it would good to have an all-encompassing species class to used and accessible through GCPy?

There is one of these in AC_tools (class species in AC_tools/variables.py), which allows for you access various variables (such as the subset below). These species objects can then be passed to functions and when properties are needed (e.g. what mechanism a species is used within) can be retrieved.

specs2use = ['O3', 'ALD2', 'ACET']                                                                                                          
spec_objs = [AC.species(i) for i in specs2use]                                                                                              
prt_str = 'Name = {:<5}, RMM: {:.2f}, Drydep: {}, Wetdep: {}, Formula: {}, Ox: {}, LaTeX: {}'    
for spec in spec_objs: 
    print( prt_str.format( spec.name, spec.RMM, spec.Drydep, spec.Wetdep )  )     
                                                                                                                                         
Name = O3   , RMM: 48.00, Drydep: True, Wetdep: False, Formula: O3, Ox: 1.0, LaTeX: O$_{3}$
Name = ALD2 , RMM: 44.05, Drydep: True, Wetdep: True, Formula: CH3CHO, Ox: nan, LaTeX: Acetaldehyde
Name = ACET , RMM: 58.08, Drydep: True, Wetdep: False, Formula: CH3C(O)CH3, Ox: nan, LaTeX: CH$_{3}$C(O)CH$_{3}$

The properties are currently just generated from the GEOS-Chem wiki page on species.

I saw the related commit on the GEOS-Chem repository and was thinking about updating to use this file when it started being outputted.

[FEATURE REQUEST] Add percent change to global mass tables

Daniel Jacob has requested that we add a percent change to the benchmark global mass table output. This will involve a little bit of reformatting in such a way the we change the global mass printout but not the emissions totals printout.

[FEATURE REQUEST] Add support for printing emissions totals from 1-year benchmarks

Issue
Right now the benchmark emissions table is geared towards printing emissions totals from the 1-month benchmark output. We will need to extend this to print annual emissions from the 1-year benchmark simulations.

Potential solution
We can probably modify the existing code to read in a year's worth of emissions (by concatenating all months of data into the Ref and Dev dataset objects), and then sum those into annual totals.

[BUG/ISSUE] Colorbar x-axis labels run into each other

The condition for switching from auto-binning of x-axis labels to forcing 4 bins in function compare_single_level is not adequate for ensuring the axis is readable. For example, the x-axis labels on this plot are very close together making it hard to read.

Screen Shot 2019-10-31 at 11 03 54 AM

This is happening because the condition to switch to 4 bins is bound difference less than 0.001. However, this only catches a subset of instances where values have three digits beyond the decimal point.

[BUG/ISSUE] Global mass tables not recognizing new species as missing in reference version

The global mass tables in GCPy are not properly recognizing when species are missing from a version. For example, in comparisons of dev/12.7.0 with GEOS-Chem 12.6.0, new species ETNO3, MENO3, IPRNO3, and NPRNO3 are all listed with the values of the previous species in the Ref column when they should be NaN.

Here is the original message from Jenny Fisher:

I don’t think the global mass tables are pointing to the right Ref version as they show values for all the new species in the 12.6.0 version.

Here is the problematic text in the global mass tables:

###############################################################################
### Global mass (Gg) at end of simulation (Trop only)                       ###
### Ref = GC_12.6.0; Dev = GC_12.7.0_SmallAlkylNitrateChem                  ###
###############################################################################
                              Ref                 Dev      Dev - Ref    % diff
ETHLN        :           0.396240            0.395432      -0.000808    -0.204
ETNO3        :           0.396240           19.799126      19.402885  4896.751
...
IPMN         :          97.114059           96.601143      -0.512917    -0.528
IPRNO3       :          97.114059           26.614943     -70.499115   -72.594
...
MEK          :        2037.818237         2031.013916      -6.804321    -0.334
MENO3        :        2037.818237           48.085583   -1989.732666   -97.640
...
NPMN         :           0.003098            0.003046      -0.000051    -1.661
NPRNO3       :           0.003098            6.418084       6.414987  207093.115
...

[DISCUSSION] To-do list for distributing GCPy as a package

We will need to do the following before we can ship GCPy as a Python package:

  1. Implement semantic versioning (i.e. version numbers X.Y.Z)
  2. Make sure all *.py files are PEP-8 compliant by running pycodestyle on them
  3. Add more examples & scripts -- solicit input from community
  4. Expand the existing unit tests, and implement continuous integration on TravisCI
  5. Add documentation on readthedocs -- right now we need to get ownership of that back from D. Rothenberg.
  6. Push gcpy to conda or pip

Any others?

[BUG/ISSUE] Restricted range differences incorrectly all zero in some emissions column sum plots

In some emissions column sum benchmark plots that compare cubed sphere data, the restricted difference plots (middle row, right column) show all zeros when the dynamic range difference subplot and fractional difference subplot show non-zeros differences. This appears to be a bug in gcpy. The colorbar range is [-0.1,0.1], which usually indicates NaN in the array trying to be plotted. This needs further investigation.

gcpy_emis_diff_plot_issue

[BUG/ISSUE] xarray open_mfdataset cannot open multiple GCHP diagnostic/restart files created with MAPL v1.0.0

Describe the bug
The xarray open_mfdataset function dies with an error when trying to read more than one file created by GCHP using the new MAPL v1.0.0 (i.e. in GCHP 12.5.0 and later versions).

To Reproduce

import xarray as xr
filelist = [ '/path/to/GCHP/file1.nc', '/path/to/GCHP/file2.nc']
ds = xr.open_mfdataset(filelist)

Expected behavior
The returned dataset should contain the merged data from e.g. file1.nc and file2.nc.

Screenshots
Instead, this error occurs:

  File "./run_1mo_benchmark.py", line 479, in <module>
    ds = xr.open_mfdataset([gchp_vs_gchp_refspc, gchp_vs_gchp_refaod])
  File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/backends/api.py", line 719, in open_mfdataset
    ids=ids)
  File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/combine.py", line 553, in _auto_combine
    data_vars=data_vars, coords=coords)
  File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/combine.py", line 475, in _combine_nd
    compat=compat)
  File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/combine.py", line 493, in _auto_combine_all_along_first_dim
    data_vars, coords)
  File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/combine.py", line 514, in _auto_combine_1d
    merged = merge(concatenated, compat=compat)
  File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/merge.py", line 532, in merge
    variables, coord_names, dims = merge_core(dict_like_objects, compat, join)
  File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/merge.py", line 451, in merge_core
    variables = merge_variables(expanded, priority_vars, compat=compat)
  File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/merge.py", line 170, in merge_variables
    merged[name] = unique_variable(name, var_list, compat)
  File "/net/seasasfs02/srv/export/seasasfs02/share_root/ryantosca/python/geo/miniconda/envs/geo/lib/python3.6/site-packages/xarray/core/merge.py", line 90, in unique_variable
    % (name, out, var))
xarray.core.merge.MergeError: conflicting values for variable 'anchor' on objects to be combined:
first value: <xarray.Variable (nf: 6, ncontact: 4)>
dask.array<shape=(6, 4, 4), dtype=int32, chunksize=(6, 4, 4)>
Attributes:
    long_name:  anchor point
second value: <xarray.Variable (nf: 6, ncontact: 4)>
dask.array<shape=(6, 4, 4), dtype=int32, chunksize=(6, 4, 4)>
Attributes:
    long_name:  anchor point

Required information:

  • OS: CentOS 7
  • python 3.6.9
  • xarray 0.12.1
  • netcdf-fortran 4.4.4
  • netcdf4 1.4.2
  • dask 2.3.0
  • dask-core 2.3.0
  • numpy 1.16.4
  • numpy-base 1.16.4
  • scipy 1.3.1

Additional context
The problem seems to be caused by a single variable called "anchor".

[BUG/ISSUE] Empty or unused directories

Directory gcpy/obs is effectively empty since it contains file init.py with no text. Some other directories contain files that are not used anywhere atm_sci. These could be cleaned up to focus GCPy on functions actively used and supported.

[BUG/ISSUE] Matplotlib function set_clim will be deprecated in Matplotlib 3.3

Using the benchmark plotting functions with the currently recommended package configuration gives the following warning about a Matplotlib function that will be deprecated in a future version:

The set_clim function was deprecated in Matplotlib 3.1 and will be removed in 3.3. Use ScalarMappable.set_clim instead. [benchmark.py:1714]

This is not a problem right now but could become a problem in the future when Matplotlib is updated. We should switch to the recommended function before Matplotlib releases 3.3.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.