geoschem / geoschem-python-tutorial Goto Github PK
View Code? Open in Web Editor NEWPython/xarray tutorial for GEOS-Chem users
License: Other
Python/xarray tutorial for GEOS-Chem users
License: Other
Hi JiaweiZhuang,
You mentioned some tutorials about Linux command line, Git, Makefile, and Python.
How about LaTex for an atmospheric scientist?
Thank you!
Hello Jiawei,
I was trying to adapt your script for converting monthly trac_avg bpch files to netCDF, but the xbpch.open_bpchdataset won't open the file and just gives errors even with the proper diaginfo.dat and tracerinfo.dat?
Is there a better way to convert the montly trac_avg bpch files to netCDF?
trac_avg bpch to nc error.txt
I have found that the new xesmf>=0.2.0 can regrid the whole dataset directly. However, for those dataset including variables that do not latlon dims, the program will complain. An example is the GEOS-Chem Restartfile that contains variables like hyam, hybm, hyai, hybi, P0
. Therefore, I just wonder would it be great to let the xesmf detect those variables automatically and keep them the regridded file? Meanwhile, print some reminders/warnings that there are some variables do not support regridding?
Not clear about the internal achievement of regrid a dataset. If using a loop, then I suppose the following codes might be useful?
# supposing ds is the dataset that we want to regrid
result_list = [regridder_bilinear(dr) for varname, dr in ds.items() if 'lon' in dr.dims and 'lat' in dr.dims]
ds_result = xr.merge(result_list)
Moreover, incorporating ds_result['hyam'] = ds['hybm']
and the like?
The xesmf complains MemoryError when I used it to regrid GC outputs from a nested simulation.
Sample data can be copied from here.
import numpy as np
import xarray as xr
import xesmf as xe
grid_in = {'lon': np.linspace(70,140,225), 'lat': np.linspace(15,55,161)}
grid_out = {'lon': np.linspace(73,136,6301), 'lat': np.linspace(18,54,3601)}
%time regridder_bilinear = xe.Regridder(grid_in, grid_out, method='bilinear')
ds = xr.open_dataset('.../geosfp_025x03125_complexSOA_SVPOA_ch/OutputDir/GEOSChem.Aerosols.20140101_0000z.nc4')
dr_result = regridder_bilinear(ds['AODHyg550nm_SALC'])
Then I got the following error:
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-38-5782543d60d6> in <module>
1 ds = xr.open_dataset('/scratch/local/feiyao/GC/nc/geosfp_025x03125_complexSOA_SVPOA_ch/OutputDir/GEOSChem.Aerosols.20140101_0000z.nc4')
----> 2 dr_result = regridder_bilinear(ds['AODHyg550nm_SALC'])
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/xesmf/frontend.py in __call__(self, indata, keep_attrs)
296 return self.regrid_dask(indata)
297 elif isinstance(indata, xr.DataArray):
--> 298 return self.regrid_dataarray(indata, keep_attrs=keep_attrs)
299 elif isinstance(indata, xr.Dataset):
300 return self.regrid_dataset(indata, keep_attrs=keep_attrs)
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/xesmf/frontend.py in regrid_dataarray(self, dr_in, keep_attrs)
347 temp_horiz_dims[1]: self.shape_out[1]
348 },
--> 349 keep_attrs=keep_attrs
350 )
351
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, *args)
967 join=join,
968 exclude_dims=exclude_dims,
--> 969 keep_attrs=keep_attrs)
970 elif any(isinstance(a, Variable) for a in args):
971 return variables_vfunc(*args)
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
216
217 data_vars = [getattr(a, 'variable', a) for a in args]
--> 218 result_var = func(*data_vars)
219
220 if signature.num_outputs > 1:
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, output_sizes, keep_attrs, *args)
563 raise ValueError('unknown setting for dask array handling in '
564 'apply_ufunc: {}'.format(dask))
--> 565 result_data = func(*input_data)
566
567 if signature.num_outputs == 1:
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/xesmf/frontend.py in regrid_numpy(self, indata)
308
309 outdata = apply_weights(self.weights, indata,
--> 310 self.shape_in, self.shape_out)
311 return outdata
312
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/xesmf/smm.py in apply_weights(weights, indata, shape_in, shape_out)
88 # use flattened array for dot operation
89 indata_flat = indata.reshape(-1, shape_in[0]*shape_in[1])
---> 90 outdata_flat = weights.dot(indata_flat.T).T
91
92 # unflattened output array
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/scipy/sparse/base.py in dot(self, other)
361
362 """
--> 363 return self * other
364
365 def power(self, n, dtype=None):
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/scipy/sparse/base.py in __mul__(self, other)
470 return self._mul_vector(other.ravel()).reshape(M, 1)
471 elif other.ndim == 2 and other.shape[0] == N:
--> 472 return self._mul_multivector(other)
473
474 if isscalarlike(other):
/exports/csce/datastore/geos/users/s1855106/miniconda/base/envs/geo/lib/python3.6/site-packages/scipy/sparse/coo.py in _mul_multivector(self, other)
588 def _mul_multivector(self, other):
589 result = np.zeros((other.shape[1], self.shape[0]),
--> 590 dtype=upcast_char(self.dtype.char, other.dtype.char))
591 for i, col in enumerate(other.T):
592 coo_matvec(self.nnz, self.row, self.col, self.data, col, result[i])
MemoryError:
Since it complains about MemoryError, I thought it might due to that the file size is too large. Therefore, I tried to regrid some slice of the data and succeed.
dr_result = regridder_bilinear(ds['AODHyg550nm_SALC'].isel(time =0))
The dr_result is like the following
<xarray.DataArray 'AODHyg550nm_SALC' (lev: 47, lat: 3601, lon: 6301)>
array([[[9.448401e-04, 9.441025e-04, ..., 8.084830e-03, 8.086944e-03],
[9.406064e-04, 9.399024e-04, ..., 8.061062e-03, 8.063201e-03],
...,
[3.425571e-06, 3.426195e-06, ..., 4.186777e-06, 4.194596e-06],
[3.403327e-06, 3.403979e-06, ..., 4.174233e-06, 4.181704e-06]],
[[9.090279e-04, 9.079278e-04, ..., 8.137903e-03, 8.139824e-03],
[9.050539e-04, 9.039893e-04, ..., 8.113728e-03, 8.115649e-03],
...,
[3.533968e-06, 3.533472e-06, ..., 4.127377e-06, 4.136035e-06],
[3.510524e-06, 3.510023e-06, ..., 4.119212e-06, 4.127624e-06]],
...,
[[0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00],
[0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00],
...,
[0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00],
[0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00]],
[[0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00],
[0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00],
...,
[0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00],
[0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00]]])
Coordinates:
time datetime64[ns] 2014-01-01T00:30:00
* lev (lev) float64 0.9925 0.9775 0.9625 ... 0.0004141 0.0001387 3.8e-05
* lon (lon) float64 73.0 73.01 73.02 73.03 ... 136.0 136.0 136.0 136.0
* lat (lat) float64 18.0 18.01 18.02 18.03 ... 53.97 53.98 53.99 54.0
Attributes:
regrid_method: bilinear
To solve the problem, I know I can do a inner loop to scan all the values within the dataset, but I just wonder if you have any better solutions?
Hi, I tried to regrid from GEOS-Chem model data (Res:4x5) to CrIS Satellite data (Res:0.5x0.625). And it seems worked perfectly as the spatial maps for original data and regridded data was same. Then I tried to do opposite which is regidded from CrIS Satellite data (Res:0.5x0.625) to GEOS-Chem model data (Res:4x5). In this case, I just changed the input and output under the xe.Regridder
function and plotted the spatial maps. However the maps generated from original and regridded data are not look same in this case. I am giving my code and figures below:
# Import required libraries
%matplotlib inline
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs #Coordinate reference system
import pandas as pd
import xesmf as xe
GC_path = '.../GC_Run_Dir/Isop_x1_cris_compare/OutputDir/GEOSChem.SatDiagn.20190701_0000z.nc4'
GC = xr.open_dataset(GC_path)
GC_ISOP = GC['SatDiagnConc_ISOP']
GC_ISOP
CrIS_ISOP_path = '.../CrIS_ISOP_data/201907_daily_CrIS_Isoprene_screened.nc'
CrIS_ISOP_ds = xr.open_dataset(CrIS_ISOP_path,decode_times=True)
CrIS_ISOP = CrIS_ISOP_ds['Isop']
# Need to get time as an xarray coordinate
# Bcz GC dim [time, lat, lon] but CrIS dim [day, lat, lon]
temp_staging_area = pd.to_datetime(CrIS_ISOP['day']+1,format='%d') + pd.DateOffset(hour=12) +pd.DateOffset(month=7) + pd.DateOffset(years=2019-1900)
CrIS_ISOP['day'] = temp_staging_area
## Rename CrIS coordinate of day to time to match with GEOS-Chem naming convention
CrIS_ISOP = CrIS_ISOP.rename({'day': 'time'})
CrIS_ISOP['time'].attrs["long_name"] = "Time"
CrIS_ISOP['time'].attrs["axis"] = "T"
CrIS_ISOP['time'].attrs
CrIS_ISOP['lat'].values
CrIS_ISOP #time: 31 lat: 361 lon: 576
# Setting up the CrIS grid (0.5x0.625)
#CrIS_ISOP['lat'] #[-89.875, -89.5, -89., ...,89., 89.5, 89.875] #361
#CrIS_ISOP['lon'] # [-180. , -179.375, -178.75 , ..., 178.125, 178.75 , 179.375] #576
CrIS_grid_with_bounds = {'lon': CrIS_ISOP['lon'].values,
'lat': CrIS_ISOP['lat'].values,
'lon_b': np.linspace(-180-0.625/2,179.375+0.625/2,577),
'lat_b':np.linspace(-89.875-0.5/2,89.875+0.5/2, 362).clip(-90, 90),
}
CrIS_grid_with_bounds
## Create Grid for GEOS-Chem 4x5 Run
#GC4x5['lat'] # [-89., -86., -82., -78.,..., 78., 82., 86., 89.] #46
#GC4x5['lon'] # [-180., -175., -170.,..., 165., 170., 175.] #72
GC_standalone_grid_with_bounds = {'lon': GC_ISOP['lon'],
'lat': GC_ISOP['lat'],
'lon_b': np.linspace(-180-5/2, 175+5/2,73),
'lat_b': np.linspace(-89-4/2, 89+4/2,47).clip(-90, 90),
}
GC_standalone_grid_with_bounds
regridder = xe.Regridder(CrIS_grid_with_bounds, GC_standalone_grid_with_bounds, 'conservative_normed', periodic=True)
regridder
Output:
xESMF Regridder
Regridding algorithm: conservative_normed
Weight filename: [conservative_normed_361x576_46x72.nc](http://conservative_normed_361x576_46x72.nc/)
Reuse pre-computed weights? False
Input grid shape: (361, 576)
Output grid shape: (46, 72)
Periodic in longitude? False
CrIS_ISOP_regridded = regridder(CrIS_ISOP, keep_attrs=True)
CrIS_ISOP_regridded.isel(time=0).plot()
CrIS_ISOP['Isop'].isel(time=0).plot()
Here, the first figure is for regridded data and the second one is for the original CrIS satellite data. But they are not same. But when I regidded from low to high resolution (4x5 to 0.5x0.625) in the same way, I got the exactly same figure for original and regridded data. What I am missing here then?
Hi there,
I am trying to follow the python tutorial to get some basic plots. However, I get traceback below. Also, my code is attached below just in case.
failed to get the current screen resources
Traceback (most recent call last):
File "panel_CO.py", line 47, in <module>
co.plot(ax=ax,cmap=WhGrYlRd)
File "/glade/u/home/lixujin/anaconda3/envs/geoschem/lib/python3.6/site-packages/xarray/plot/plot.py", line 552, in __call__
return plot(self._da, **kwargs)
File "/glade/u/home/lixujin/anaconda3/envs/geoschem/lib/python3.6/site-packages/xarray/plot/plot.py", line 187, in plot
return plotfunc(darray, **kwargs)
File "/glade/u/home/lixujin/anaconda3/envs/geoschem/lib/python3.6/site-packages/xarray/plot/plot.py", line 854, in newplotfunc
**kwargs)
File "/glade/u/home/lixujin/anaconda3/envs/geoschem/lib/python3.6/site-packages/xarray/plot/plot.py", line 1108, in pcolormesh
primitive = ax.pcolormesh(x, y, z, **kwargs)
File "/glade/u/home/lixujin/anaconda3/envs/geoschem/lib/python3.6/site-packages/cartopy/mpl/geoaxes.py", line 1451, in pcolormesh
result = self._pcolormesh_patched(*args, **kwargs)
File "/glade/u/home/lixujin/anaconda3/envs/geoschem/lib/python3.6/site-packages/cartopy/mpl/geoaxes.py", line 1469, in _pcolormesh_patched
if not self._hold:
AttributeError: 'GeoAxesSubplot' object has no attribute '_hold'
Code
# Read data and print out to see the structure
# Pay attention to unit conversion
# Author: Lixu Jin
# Version: GEOS-CHEM 12.2.1, 09/15/2019
# import packages
#from mpl_toolkits.basemap import Basemap
from gamap_colormap import WhGrYlRd
import cartopy.crs as ccrs
import xbpch
import matplotlib.pyplot as plt
import xarray as xr
import numpy as np
from matplotlib.colors import LogNorm
plt.style.use(['seaborn-talk', 'seaborn-ticks'])
#=============================================================
# Read single dataset
#=============================================================
# Read data
filedir = '/glade/u/home/lixujin/project/GEOS-CHEM/ExtData/HEMCO/GFAS/v2018-09/2018/'
data1 = 'GFAS_201807.nc'
data2 = 'GFAS_201808.nc'
data3 = 'GFAS_201809.nc'
ds1 = xr.open_dataset(filedir + data1)
ds2 = xr.open_dataset(filedir + data2)
ds3 = xr.open_dataset(filedir + data3)
#============================================================
# calculation part
#============================================================
# sum up druing we-can period
co = 60*60*24*1e3*(ds1['cofire'].sum('time') + ds2['cofire'].sum('time') + ds3['cofire'].sum('time'))
co.attrs['unit'] = 'ppm'
print(co)
print(type(co))
# fig setting
# empty map.
fig = plt.figure(figsize=(8,4))
# without this, it would be super wired.
ax = plt.axes(projection=ccrs.PlateCarree())
ax.coastlines()
ax.set_title('global CO during wecan period')
co.plot(ax=ax,cmap=WhGrYlRd)
#fig.plot(cmap='gist_stern') # The ax keyword is particularly useful for subplots. Can be omitted here.
fig.savefig('co_global1.jpg')
Can you please provide python functions/code to convert level index to height or pressure for vertical plotting of Geos-Chem output.
I turned on prod/loss output in HISTORY.rc to output production rate of Ox. And I got the output file GEOSChem.ProdLoss.*.nc4 with the 3-D variable Prod_Ox and the unit is kg/s with really large numbers around 1e10. Is there something wrong with the unit?
List the steps to reproduce the bug or issue below:
...add your python code snippets here
A clear and concise description of what you expected to happen.
Run the following commands from an interactive Python session:
>>> import xarray as xr
>>> xr.show_versions()
This will display information about your system and the versions of Python packages that GEOSChem-python-tutorial depends on.
Provide any other context about the problem here.
I noticed that the files tracerinfo.dat and diaginfo.dat created by GEOS-Chem had been adjusted, which made old version bpch2nc conversion error. I want to know if there was a new version of the bpch2nc python module could solve this problem, and if it is possible send it to me [email protected].
Thank you so much.
Hello Jiawei,
Do you have an example code of conservative regridding where you mention in Chapter 3 Regridding "To more rigorously conserve mass, you would scale the mixing ratio field by air density before regridding, and then scale the regridding result back to mixing ratio."?
Thanks!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.