Comments (21)
@JiaweiZhuang, the in-memory weights branch is ready for testing. A Dockerfile for building ESMF from that branch is available here.
For reference, here are links to the docstrings for the new interfaces. ESMF doesn't build docs for branches...
In our testing, the factor arrays contain zero elements with no spatial overlap (disjoint geometries) and unmapped action is ignore. Again, sorry for the delay on this. Hopefully we won't encounter too many issues during your implementation.
from xesmf.
We have factors in the Python interface now and are working on testing. There are still some compiler-related things to address.
Before charging forward, I want to get some input on the interface. The factors themselves are allocated in Fortran, so we need to make sure their deallocation is handled smartly. With bigger grids, we want to minimize copies too. Nothing is fixed now, so open to any suggestions.
regrid = Regrid(..., factors=True) # factors is False by default
factor_list, factor_index_list = regrid.get_factors(deep_copy=True) # deep_copy is False by default
# Preferred deallocation method for factors but __del__ will also do this.
regrid.destroy()
I am thinking it makes sense to add an option for zero-based factor indexing. Maybe a start_index
option to Regrid
that defaults to 1
? It needs to happen in the Regrid
call so we can adjust the indexing in ESMF and not in the Python layer - again to avoid copies.
from xesmf.
This slows down the sparse MM by ~40%. See this demo. It can be fixed by copying and rearranging the memory layout, anyway.
This is a downer. I think what we'll need is a follow on issue where we adapt the regrid store interface in ESMF to return vectors for the index lists directly.
Looks good me if you don't mind adding & maintaining one more function.
Don't mind.
Even though the old ESMF.Regrid object is not referenced by any variable identifiers, the memory will not be freed (in contrast to the automatic garbage collection for normal numpy arrays).
It's hard to guarantee how the Python garbage collection will run. As usual, SO provides a good overview. This is why we introduced the destroy
method. The memory will be cleaned up eventually, it may just not happen when you really need it. destroy
will be called via __del__
/ the destructor on the ESMF.Regrid
object when gc runs.
In summary
Thanks for this.
My mistake -- field.get_area() is to compute the area and add to field.data, not directly returning the area array.
You are correct, and you are forgiven. π Not like I caught it anyway.
I'm really hoping to ship these features soon.
from xesmf.
Also the weight file needs to have more complete metadata, not just numbers (can be done outside of ESMPy)
from xesmf.
@JiaweiZhuang / @bekozi - has there been any update on this issue? As I understand it, we need a way to have ESMPy return in memory weights. Has that issue been raised upstream?
from xesmf.
I raised this issue a while ago; not sure if there's any progress
also @rokuingh
from xesmf.
Talked with @rokuingh today, and we have a plan for moving this forward. I hope it won't be longer than a few weeks.
from xesmf.
That's great news @bekozi ! Is it planned for some patch version of 7.1.0 or 8.0.0?
from xesmf.
Good question. It will be in 8 but rolled out in a beta snapshot much sooner...
from xesmf.
Glad to see progress on this important issue! I have several comments & questions:
- What's in
factor_list
andfactor_index_list
? I guessfactor_list
is an 1D numpy array of weights (S
in the offline weights file). Isfactor_index_list
a 2D array containingcol
androw
, or a list of two 1D arrays? I guess it is a list so you needdeep_copy
, not justcopy
?
I would prefer just returning 3 numpy arrays and use copy
:
row, col, S = regrid.get_factors(copy=True)
Those arrays will then construct a Scipy sparse matrix by scipy.sparse.coo_matrix(S, (row, col), copy=False)
without copying.
- Zero-based indexing: Currently in xESMF I simply subtract 1 from default grid indices. Will the
start_index
implementation be more computationally efficient? If not, this correction can be done in Python level, in-place, so not extra copies will be made.
My general feeling is that direct ESMF/ESMPy calls should only deal with the very core functionalities that must be done in Fortran-level. Then we rarely, rarely need to update ESMPy's API. Once a new ESMPy function like get_factors()
is implemented, it can remain stable for many versions. Additional small fixes can be quickly done in the Python level, which is much easier to debug and update.
-
Naming: would
regrid.get_weights()
be more intuitive thanregrid.get_factors()
? -
For the sake of consistency, the
Field.get_area()
method should probably need acopy
option as well. It can indeed by done byField.get_area().copy()
, but ancopy
option is pretty common in numpy/scipy functions.
from xesmf.
With bigger grids, we want to minimize copies too.
A zero-copying workflow would be:
row, col, S = regrid.get_weights(copy=False)
row -= 1 # modify the Fortran array in-place. Any potential issue?
col -= 1
weights = scipy.sparse.coo_matrix(S, (row, col), copy=False)
regridder = xesmf.Regridder()
regridder.weights = weights # will be done in class __init__ instead
# do not call regrid.destroy()
In this case, the xesmf.Regridder
directly uses the internal Fortran arrays in the ESMF.Regrid
object. regrid.destroy()
releases the Fortran array, which will destroy the xESMF regridder as well. Is there any potential danger with this approach, even if explicit destroy()
is prohibited in xESMF user API? Will other ESMF error mess up the Fortran array? If so, the default behavior should be copy=True
followed by destroy()
, so the numpy array for weights is decoupled from ESMF; while copy=False
without destroy()
is only an option to minimize copying. The eventual memory use of the two cases are actually the same; the former uses more peak memory during copying.
from xesmf.
Thanks for the input! All good stuff.
Is factor_index_list a 2D array containing col and row, or a list of two 1D arrays? I guess it is a list so you need deep_copy, not just copy?
Out of habit, I prefer to be explicit between shallow and deep copies in Python even when working with NumPy arrays. However, we don't have a policy for naming these in ESMPy. I agree copy
is cleaner...need to do some thinking on this...
The function will return a np.float64
ndarray with shape (nfactors,)
for the factor list and an np.int32
ndarray with shape (2, nfactors)
for the factor index list. The first column in the factor index list is the source index with the second column being the destination index. We'll make sure the docs are very clear about this. My head spins sometimes when trying to map Fortran onto Python and vice versa.
With ESMPy, we have tried to stay close to how ESMF implements things under the hood. Hence, the factor and factor index list will take the same form as the regrid store call in ESMF. Mostly this is an attempt to reduce duplication with the docs.
I would prefer just returning 3 numpy arrays and use copy. Those arrays will then construct a Scipy sparse matrix by scipy.sparse.coo_matrix(S, (row, col), copy=False) without copying.
We could add a convenience function or options that would do this. We may want to expose the ESMF sparse matrix utility through Python at some point. In that case, we would want the factor array forms to be compatible with ESMF.
I thought at one point to consider a get_factors_dict
which would return a dictionary with keys (row, col, S) mapping to their associated values (numpy arrays). What do you think? This function would also take a copy parameter. We could definitely name this one get_weights_dict
as I expect it could be the preferred entry point for xESMF.
Will the start_index implementation be more computationally efficient?
I don't really know the answer to this. My sense is that it is not worth implementing as in-place NumPy stuff is pretty fast, but it was a question worth posing at least.
My general feeling is that direct ESMF/ESMPy calls should only deal with the very core functionalities that must be done in Fortran-level. Then we rarely, rarely need to update ESMPy's API.
This. π Summed up the ESMPy's design mentality. It doesn't feel Pythonic most of the time!
Naming: would regrid.get_weights() be more intuitive than regrid.get_factors()
It would. It's mostly an attempt to establish consistency with the named factor arguments to ESMF. I don't feel strongly about it though. In my mind, we should consider a get_weights
method that just passes through to get_factors
...
For the sake of consistency, the Field.get_area() method should probably need a copy option as well.
Good point. The copy flags to these functions are meant to be additional indicators to users that they need to be conscious of ESMF memory.
Is there any potential danger with this approach, even if explicit destroy() is prohibited in xESMF user API? Will other ESMF error mess up the Fortran array?
The safest bet is to deep copy everything as you know. Code that does not explicitly call destroy but creates numerous ESMF.Regrid(..., factors=True)
objects in the same scope can get some serious memory accumulations. One option could be to introduce a context manager for ESMF.Regrid
to ensure destroys are appropriately called. I hope to answer this better with additional testing.
I'm not quite sure what you mean by other ESMF errors. I think once these arrays are created they should be stable independent of other ESMF usage (in ESMF these arrays are not attached to anything). The same goes for the deallocation. I'm probably not understanding your second question.
from xesmf.
The function will return a np.float64 ndarray with shape (nfactors,) for the factor list and an np.int32 ndarray with shape (2, nfactors) for the factor index list.
We could add a convenience function or options that would do this.
(2, nfactors)
looks good to me. No need to change it or add additional wrappers. The ESMPy API can be as lean as possible.
A minor concern is that factor_index_list
probably has Fortran-ordering as other ESMPy output, so row = factor_index_list[0,:]
and col = factor_index_list[1,:]
are not contiguous in memory. This slows down the sparse MM by ~40%. See this demo. It can be fixed by copying and rearranging the memory layout, anyway.
It would. It's mostly an attempt to establish consistency with the named factor arguments to ESMF. I don't feel strongly about it though. In my mind, we should consider a get_weights method that just passes through to get_factors...
It makes perfect sense for ESMPy to stay close to ESMF. I am OK with the current name. Multiple aliases to the same method will probably confuse pure-ESMPy users.
I thought at one point to consider a get_factors_dict which would return a dictionary with keys (row, col, S)
Looks good me if you don't mind adding & maintaining one more functionπ
The safest bet is to deep copy everything as you know. Code that does not explicitly call destroy but creates numerous ESMF.Regrid(..., factors=True) objects in the same scope can get some serious memory accumulations.
This is exactly what I am concerning about. I am not super familiar with ESMPy's memory management, but from what you said I guess repeated calls to regrid = ESMF.Regrid(...)
without regrid.destroy()
will lead to lots of memory leak. Even though the old ESMF.Regrid
object is not referenced by any variable identifiers, the memory will not be freed (in contrast to the automatic garbage collection for normal numpy arrays).
If true, with the zero-copying implementation mentioned above, there will be memory leak if a user repeatedly calls regridder=xesmf.Regridder(...)
. With copying and explicit destroy()
, the regridder
will only hold normally-created numpy arrays, and when theregridder
variable identifier is overwritten (i.e. points to a new regridder object), the old memory should be automatically garbage-collected.
I'm not quite sure what you mean by other ESMF errors.
I guess I more mean "mis-uses" (like the above case) than "errors"...
In summary:
- Current API and naming conventions are all fine to me, as I understand that many of them come from ESMF conventions.
- I don't particularly need any convenience wrappers (a stable and simple API is more important to me, not matter how it looks). But if you want to add some wrappers for pure-ESMPy users, I am happy to take advantage of them.
- xESMF should make copies of the weights and then explicitly destroy the ESMF object, to safely manage memory. Directly referencing the Fortran array might be an (unrecommended) option for extremely large grids.
from xesmf.
@bekozi My mistake -- field.get_area()
is to compute the area and add to field.data
, not directly returning the area array. So it behaves differently from get_factors()
and there is no way to add copy
option...
from xesmf.
Quick update. We hit an impasse with compilers that we are working through. This was not unexpected. The implementation is working with the standard GNU compiler stack but is unstable on others (i.e. Discover standard modules). The rub with this feature is that the weight arrays are dynamically allocated in Fortran complicating pointer management. @rokuingh and I are working through how best to move the arrays up to Python using Fortran's iso_c_binding
module. We used a temporary solution to allow Python development to move forward.
You can see the ESMPy
method for retrieving the weight dictionary on SF if you like. The good news is that deep copying out of this method will result in contiguous memory from what I can tell.
from xesmf.
Thanks very much for the work! I think all xESMF users use conda (with GNU compiler) and won't be affected by this compiler issue.
from xesmf.
@bekozi Aha, finally!!
I can use the Dockerfile without problems. Will let you know if I hit any issues.
from xesmf.
@bekozi One minor documentation thing: create_rh
is not documented in ESMF.Regrid
's docstring.
Simply using the one in ESMF docs should be fine:
create_rh
Specifies whether or not to create a routehandle, or just write weights to file. If not specified, defaults to ESMF_TRUE.
I assume that it should be set to False
if the Regrid call is only for weight retrieval, either offline (filename = '...'
) or online (factors=True
) .
from xesmf.
One minor documentation thing: create_rh is not documented in ESMF.Regrid's docstring.
Noted! Thanks.
I assume that it should be set to False if the Regrid call is only for weight retrieval, either offline (filename = '...') or online (factors=True) .
Yes, it makes sense to use create_rh=False
for best performance when returning factors. This should be noted in the documentation. I'll update the test loop too. Edit: It looks like this is already tested appropriately.
from xesmf.
I already have a mostly-working code for in-memory weight retrieval, but it relies on ESMF 8.0.0 that is not publicly released yet. So far I've been building the development branch of ESMF at https://github.com/JiaweiZhuang/Docker_ESMPy. Will push my new code when ESMPy/ESMF 8.0.0 becomes conda-installable.
This is of top-priority as the current offline weight approach is quite awkward and hard to test.
from xesmf.
ESMF 8.0.0 is released today and supports online weight retrieval (see release notes):
The following capabilities were added to the ESMF Python interface (ESMPy):
Added an in-memory weight generation option to the Regrid class, allowing re-use of weight vectors without writing them to netCDF. The weight arrays can be returned as NumPy objects or Python dictionary of weight vectors. This allows retrieval of the weights by source and destination key/value pairs.
from xesmf.
Related Issues (20)
- After conda update, problem with cf_xarray: "Receive multiple variables for key 'longitude': ['i', 'lon']" HOT 4
- installation impossible with anaconda HOT 3
- MOAB support in ESMF 8.1.0 HOT 1
- repo not maintained see https://github.com/pangeo-data/xESMF
- ValueError: Dimensions {'bounds'} do not exist. Expected one or more of ('lon', 'bnds') HOT 1
- Example of interpolating 6 tiles cube-sphere grid to lat-lon grid HOT 2
- Error writing log file HOT 1
- xESMF regridding showing repeated values at latitude poles? HOT 2
- xESMF error after 12+ hour script run in downscaling grid/upsampling data HOT 2
- Misrepresentation of the number of precipitation events HOT 1
- ImportError: The ESMF shared library did not load properly. HOT 1
- How does xesmf treats grid cell with Nan values? HOT 1
- Hi, want to regrid wrfout (rainc+rainnc) data HOT 1
- ModuleNotFoundError: No module named 'ESMF' HOT 6
- ValueError: {'longitude', 'latitude'} are not variables in the underlying object. {'longitude', 'latitude'} are dimensions with no index.
- Using xesmf to efficiently regrid data to another resolution including vertical dimension
- Issue xe.regridder(); cannot reshape array of size A*B*C*D into shape (A, B, C, D) HOT 1
- Not Found for url
- pytest reports PIO errors HOT 1
- regrid velocity vector from tripolar grid to regular lat-lon grid
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from xesmf.