
iris-esmf-regrid's People

Contributors

bjlittle, dependabot[bot], esadek-mo, github-actions[bot], hgwright, jamesp, lbdreyer, pre-commit-ci[bot], stephenworsley, trexfeathers, zklaus

iris-esmf-regrid's Issues

Enable Python 3.9 and 3.10

✨ Feature Request

It would be great if we could get Python 3.9 and Python 3.10 builds for iris-esmf-regrid. At the moment, the maximum version is 3.8, but it is unclear to me whether that is simply because 3.8 was the latest version available when the CI was set up, or for a deeper reason.

Does anyone know?

Handle over-180-degrees cells without treating edges as latlon-aligned

Seeing comment here

In addition, ESMF cannot handle any cells covering >=180 degrees, e.g. longitudes -180..+180 : such cases will give errors : These cases can be fixed by specifying resolution=2.

While that statement is all very well, it is in that case simply not doing the same calculation.

So, suppose I have a cell with an edge that runs (-120, -30) to (+120, +30) : I might actually want that line to be on a great circle, and more especially if the boundaries are not generally along lines of latitude or longitude.
I can in principle achieve that by taking a different intermediate point, which lies on the GC instead of the 'straight latlon' line.
But obviously, this is not what the resolution keyword provides.
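The difference between the two kinds of intermediate point can be sketched numerically. The snippet below is a minimal illustration using only the standard library; the helper names and the example edge are assumptions made for this sketch and are not part of esmf_regrid or taken from the issue. It compares the midpoint of the shorter great-circle arc with the midpoint of the 'straight latlon' line.

```python
import math


def to_xyz(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to a unit vector."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def great_circle_midpoint(p0, p1):
    """Midpoint of the shorter great-circle arc between two (lon, lat) points.

    Not valid for antipodal points, where the midpoint is undefined.
    """
    x, y, z = (a + b for a, b in zip(to_xyz(*p0), to_xyz(*p1)))
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.degrees(math.atan2(y, x))
    lat = math.degrees(math.asin(z / norm))
    return lon, lat


def latlon_midpoint(p0, p1):
    """Midpoint of the 'straight latlon' line between two (lon, lat) points."""
    return (p0[0] + p1[0]) / 2, (p0[1] + p1[1]) / 2


# Illustrative edge, chosen for this sketch rather than taken from the issue.
start, end = (0.0, 0.0), (90.0, 60.0)
gc_mid = great_circle_midpoint(start, end)
ll_mid = latlon_midpoint(start, end)
print(gc_mid, ll_mid)
```

For this edge the two midpoints differ by several degrees in both longitude and latitude, which is the sense in which subdividing along a latlon line is "not the same calculation" as subdividing along a great circle.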

In fact, this could occur any time a target has cells which are "large" but not specially latlon-aligned, and in particular when their edges are supposed to be on GCs rather than some 'straight latlon' line.

In effect, this is due to the conflation of the "two problems" I mentioned in the above comment: that of handling over-large cells, and that of approximating 'straight latlon' boundary lines.

So "ideally", we could provide an additional keyword that controls whether the boundaries of over-large cells are divided along a latlon line, or on a GC.
E.g. a new key like resolve_latlon=True, for both GridToMeshESMFRegridder and MeshToGridESMFRegridder.

Benchmarks failing to run.

📰 Custom Issue

See the failing CI.

I expect I baked in some inflexibility as this setup was my first attempt at GitHub-integrated benchmarking. I'd like to replicate the further developed architecture we have in Iris (see here), which would solve the issue and otherwise improve matters. I'll see if I can find time in the coming weeks.

Add ESMFAreaWeighted Iris scheme

✨ Feature Request

Fill in the class ESMFAreaWeighted so that it provides enough support to allow the regridding of Iris cubes. The scope of this issue only covers the most basic kind of regridding (2D cubes with rectilinear grids and no masked data).

Motivation

The Iris schemes will provide the front end of this package. Providing a rough version of this early on in development will allow for easier development later on by allowing experimentation with grids from different cubes.

GridToMeshESMFRegridder: Allow use of mesh objects for target mesh

✨ Feature Request

It would be incredibly useful when generating regridders for GridToMeshESMFRegridder to be able to use mesh objects rather than requiring a cube for the target.

Motivation

This is motivated by the need to generate ancillary files for running LFRic. In NGMS an LFRic mesh generator is used to provide mesh files for how the model should be set up. These are loadable by iris using the load_mesh routine and all seem correct, etc. However, when it comes to constructing a regridder you can't use them in operations like: rg = GridToMeshESMFRegridder(regridme, targetmesh) as you get the error: AttributeError: 'Mesh' object has no attribute 'mesh'. Thus, in order to use the mesh we would then need to write code to generate an empty cube to attach the mesh to. This is redundant and unnecessary boilerplate.

@stephenworsley

Slow performance for longitude means

🐛 Bug Report

When regridding onto a cube whose latitude coordinate has bounds [[-90, 90]] and whose longitude coordinate has many points/bounds, initialising a regridder takes longer than is expected. In order to work correctly, this regridding must be done with a resolution keyword, however internal logic currently means that this is effectively always resolution=2. It seems like ESMF doesn't like long thin cells, so it might be worth rethinking this approach.

Lat-lon dimension ordering switched

🐛 Bug Report

As demonstrated by the code included at the end of #24, data on grids is returned with the dimensions ordered longitude-latitude. According to the CF spec, the tendency is for dimensions to be ordered latitude-longitude. This is the convention used by Cartopy, for example, which is why the data needed to be transposed in order to plot properly with Cartopy. The code should be changed to respect the default CF dimension ordering.

Allow non-degree units

📰 Custom Issue

If a coord system is not described natively in degree units it should still be possible to regrid it.

`iris.cube.Cube.regrid()`-compatible scheme for `MeshToGridESMFRegridder`

✨ Feature Request

It would be great to have a regridding scheme that can be recognized by iris.cube.Cube.regrid() based on MeshToGridESMFRegridder (just like ESMFAreaWeighted for ESMFAreaWeightedRegridder).

Motivation

In ESMValTool, we use iris.cube.Cube.regrid() for regridding; thus, a mesh-to-grid (or the other way round) scheme that can be recognized by this function would be great!

Thanks for all the great work on this package!

Automatic Airspeed Velocity benchmark results publish

✨ Feature Request

Create an automated job to run asv gh-pages --rewrite, which uses the GitHub Pages infrastructure to create an ASV results dashboard (example).

Motivation

Easy visualisation of this project's performance over its commit history, in a shared standardised way, will aid in development and collaboration. It also improves the project's image by showing potential collaborators/users that coverage extends to routine performance monitoring.

NOTE: benchmarking is already part of the CI 'test harness', but this only produces numeric results and skips any particularly slow benchmarks.

More detail in here!

This was previously attempted but removed due to permissions/security concerns - see #88. It seems that a dedicated GitHub Action will probably be the right implementation - here is a good example.

Virtual machines that use shared resource - including those used for GHA, Cirrus and most/all other CI solutions - are vulnerable to variable performance, which is very unhelpful for performance testing! There are steps we should take to help compensate:

  • Publish results from a single continuous ASV run.
    • As opposed to the example above that forces ASV to recognise the machine as the same one each time, allowing just benchmarking new commits and appending their results to existing ones.
    • Avoids the following performance variability:
      • Variability between multiple GHA runs.
      • Variability on long time scales (e.g. GHA demand at different times of day).
      • Variability from GitHub periodically changing/upgrading their Actions architecture.
    • This will result in long run times. I initially suggest a full run+publish after every PR merge, but if this takes prohibitively long we can consider scheduling nightly/weekly/... runs.
  • Work with Airspeed Velocity's design to reduce the impact of noise.
    • ASV automatically performs repeats and applies statistics specifically with noise in mind.
    • A variety of configurability is on offer to further compensate for within-run noise if it is still a problem.
      Some of these could also be useful for the existing CI benchmarking (see the CLI commands and the benchmark scripts).

Ensure regridders are pickleable

✨ Feature Request

During development of #100, it became necessary for regridder objects to be picklable. This was blocked for grids defined on certain coordinate systems (rotated pole) because they contain a cartopy projection object, which was not picklable. This object is assigned here:

One way around this would be to instead hold an object which generates an appropriate cartopy projection, e.g. an iris CoordSystem.
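The general pattern can be sketched with the standard library alone. In the snippet below, FakeProjection and GridRecord are hypothetical stand-ins invented for this illustration (they are not cartopy, iris or esmf_regrid classes): the holder stores only the plain parameters, builds the unpicklable object lazily, and drops it again when pickled.

```python
import pickle
import threading


class FakeProjection:
    """Hypothetical stand-in for an unpicklable projection object."""

    def __init__(self, pole_lon, pole_lat):
        self.pole_lon = pole_lon
        self.pole_lat = pole_lat
        self._lock = threading.Lock()  # a lock makes the instance unpicklable


class GridRecord:
    """Hypothetical holder that stores parameters, not the projection itself."""

    def __init__(self, pole_lon, pole_lat):
        self.pole_lon = pole_lon
        self.pole_lat = pole_lat
        self._projection = None  # built lazily, never pickled

    @property
    def projection(self):
        """Build the projection on first use and cache it."""
        if self._projection is None:
            self._projection = FakeProjection(self.pole_lon, self.pole_lat)
        return self._projection

    def __getstate__(self):
        # Drop the cached projection so only plain numbers are pickled.
        state = self.__dict__.copy()
        state["_projection"] = None
        return state


def roundtrip(obj):
    """Pickle and unpickle an object, returning the restored copy."""
    return pickle.loads(pickle.dumps(obj))
```

With this layout, a record whose projection has already been built can still go through roundtrip(), and the restored copy rebuilds its projection the first time it is asked for one.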

mint_comparison.py crashes with esmpy==8.4.0

🐛 Bug Report

I have installed the latest
esmf 8.4.0 mpi_mpich_h54662ac_101 conda-forge
esmf-regrid 0.5.0 pypi_0 pypi
esmpy 8.4.0 mpi_mpich_py39h027448b_101 conda-forge
iris 3.4.0 pyhd8ed1ab_0 conda-forge
and I'm getting
(mint-iris-esmf) NIWA-1008648~/iris-esmf-regrid-edge-regridding/esmf_regrid/demos$ python mint_comparison.py
libc++abi: terminating with uncaught exception of type int
Abort trap: 6

How To Reproduce

Steps to reproduce the behaviour:

  1. forked git@github.com:stephenworsley/iris-esmf-regrid.git branch edge_regridding
  2. conda create -n mint-iris-esmf python=3.9
  3. conda activate mint-iris-esmf
  4. conda install -c conda-forge scitools-iris esmpy
  5. pip install -e .
  6. cd esmf_regrid/demos
  7. python mint_comparison.py

Expected behaviour

I did not expect "python mint_comparison.py" to throw an exception

Screenshots

Environment

  • OS and version: Darwin NIWA-1008648.niwa.local 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:17:10 PDT 2022; root:xnu-8020.140.49~2/RELEASE_X86_64 x86_64
  • esmf_regrid version: [e.g., From the command line run python -c "import esmf_regrid; print(esmf_regrid.__version__)"]

GridToMeshESMFRegridder: TypeError: buffer is too small for requested array

🐛 Bug Report

@stephenworsley - When working with large source data and trying to generate a regridder am hitting TypeError: buffer is too small for requested array

How To Reproduce

Steps to reproduce the behaviour:

  1. Start a SPICE interactive session with 180GB memory (not actually necessary but gives sufficient overhead to prove this isn't a memory availability problem)
  2. module load scitools
  3. set: ulimit -u unlimited
  4. start python
  5. load in large source file (please contact separately for path)
  6. load in a C48 target cube
  7. generate regridder
  8. wait for traceback to be generated

Expected behaviour

regridder to be generated

Environment

  • SPICE interactive
  • scitools/default-current
  • esmf_regrid version: 0.5.0

Additional context

We need to be able to regrid this particular large source file to a range of UGrid resolutions (target C896 for now) in order to generate ancillary files for starting LFRic. An example script and source files can be provided on request but am not able to advertise on a public repo.

Cannot load a minimalistic ugrid file

🐛 Bug Report

Cannot load a mesh from a ugrid file that contains:

variables:
    int Mesh2d ;
        Mesh2d:cf_role = "mesh_topology" ;
        Mesh2d:edge_node_connectivity = "Mesh2d_edge_nodes" ;
        Mesh2d:face_edge_connectivity = "Mesh2d_face_edges" ;
        Mesh2d:face_node_connectivity = "Mesh2d_face_nodes" ;
        Mesh2d:long_name = "Topology data of 2D unstructured mesh" ;
        Mesh2d:node_coordinates = "Mesh2d_node_x Mesh2d_node_y" ;
        Mesh2d:topology_dimension = 2 ;
    int Mesh2d_face_edges(nMesh2d_face, nMesh2d_vertex) ;
    ...

How To Reproduce

Steps to reproduce the behaviour:

  1. Download https://github.com/pletzer/mint/blob/master/data/lfric_diag_wind.nc
  2. Run the script below

import iris
from iris.experimental.ugrid import PARSE_UGRID_ON_LOAD

with PARSE_UGRID_ON_LOAD.context():
    mesh = iris.load_cube("lfric_diag_wind.nc")

You'll get

(iris-dev) niwa-35791~/iris-test$ python terror.py
Traceback (most recent call last):
  File "terror.py", line 4, in <module>
    mesh = iris.load_cube("lfric_diag_wind.nc")
  File "/Users/pletzera/iris/lib/iris/__init__.py", line 341, in load_cube
    cubes = _load_collection(uris, constraints, callback).cubes()
  File "/Users/pletzera/iris/lib/iris/__init__.py", line 276, in _load_collection
    result = _CubeFilterCollection.from_cubes(cubes, constraints)
  File "/Users/pletzera/iris/lib/iris/cube.py", line 104, in from_cubes
    for cube in cubes:
  File "/Users/pletzera/iris/lib/iris/__init__.py", line 261, in _generate_cubes
    for cube in iris.io.load_files(part_names, callback, constraints):
  File "/Users/pletzera/iris/lib/iris/io/__init__.py", line 208, in load_files
    for cube in handling_format_spec.handler(
  File "/Users/pletzera/iris/lib/iris/fileformats/netcdf.py", line 974, in load_cubes
    mesh_coords, mesh_dim = _build_mesh_coords(mesh, cf_var)
  File "/Users/pletzera/iris/lib/iris/experimental/ugrid/load.py", line 492, in _build_mesh_coords
    mesh_coords = mesh.to_MeshCoords(location=cf_var.location)
  File "/Users/pletzera/iris/lib/iris/experimental/ugrid/mesh.py", line 1972, in to_MeshCoords
    result = [
  File "/Users/pletzera/iris/lib/iris/experimental/ugrid/mesh.py", line 1973, in <listcomp>
    self.to_MeshCoord(location=location, axis=ax) for ax in self.AXES
  File "/Users/pletzera/iris/lib/iris/experimental/ugrid/mesh.py", line 1947, in to_MeshCoord
    return MeshCoord(mesh=self, location=location, axis=axis)
  File "/Users/pletzera/iris/lib/iris/experimental/ugrid/mesh.py", line 2835, in __init__
    points, bounds = self._construct_access_arrays()
  File "/Users/pletzera/iris/lib/iris/experimental/ugrid/mesh.py", line 3046, in _construct_access_arrays
    points_coord = self.mesh.coord(include_edges=True, axis=axis)
  File "/Users/pletzera/iris/lib/iris/experimental/ugrid/mesh.py", line 1614, in coord
    result = self._coord_manager.filter(
  File "/Users/pletzera/iris/lib/iris/experimental/ugrid/mesh.py", line 2283, in filter
    raise CoordinateNotFoundError(emsg)
iris.exceptions.CoordinateNotFoundError: 'Expected to find exactly 1 coordinate, but found none.'

Expected behaviour

One should be able to load a ugrid file with variables that don't have coordinates. This would be expected for "extensive" variables, which are defined over the entire cell, face or edge. (All nodal variables are expected, however, to have coordinates.)

Screenshots

N/A

Environment

  • OS and version: Darwin niwa-35791.niwa.local 21.5.0 Darwin Kernel Version 21.5.0: Tue Apr 26 21:08:22 PDT 2022; root:xnu-8020.121.3~4/RELEASE_X86_64 x86_64
  • esmf_regrid version: [e.g., From the command line run python -c "import esmf_regrid; print(esmf_regrid.__version__)"]: 0.5.dev0

Rudimentary docs

✨ Feature Request

Rudimentary docs

Motivation

It would be useful to have some very basic docs to show how to install iris-esmf-regrid (I guess just conda or pip install) and how to do a very basic regrid please. That should be enough to get started and the rest can be understood from the source.

Additional context

As suggested in SciTools/iris#4463

Apply more rigorous checks on input cubes

Currently, checks on cubes and coordinates are provisional and potentially not exhaustive. Assert statements are used rather than raising appropriate errors. This has been deferred from PRs like #125 to be its own issue. Regridding schemes ought to have a similar level of checks to those that exist in iris.
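As a rough illustration of the difference, the sketch below replaces a bare assert with an explicit exception. The helper check_contiguous and its signature are hypothetical and not part of esmf_regrid; bounds are represented as plain (lower, upper) pairs rather than Iris coordinates. Unlike an assert, the explicit check still runs when Python is started with -O, and it tells the user which cells are at fault.

```python
def check_contiguous(bounds, name="coordinate"):
    """Raise ValueError unless each cell's upper bound meets the next lower bound.

    `bounds` is a sequence of (lower, upper) pairs. This helper is a
    hypothetical illustration, not an esmf_regrid function.
    """
    for i, ((_, upper), (lower, _)) in enumerate(zip(bounds, bounds[1:])):
        if upper != lower:
            raise ValueError(
                f"{name} bounds are not contiguous between cells "
                f"{i} and {i + 1}: {upper} != {lower}"
            )


# Contiguous bounds pass silently.
check_contiguous([(0, 10), (10, 20), (20, 30)], name="longitude")
```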

regrid_unstructured_to_rectilinear() fails if cube is chunked along a non-horizontal dimension

🐛 Bug Report

Firstly, apologies for reporting a bug in a non-main branch. I am using the unstructured_scheme branch to work with LFRic output.

A ValueError occurs if I apply esmf_regrid.experimental.unstructured_scheme.regrid_unstructured_to_rectilinear() on cubes that are chunked along the time and/or vertical dimension. This situation is quite common when cubes are loaded from multiple files, so I would really appreciate if you fix this bug or provide a workaround.

My current workaround is to rechunk the data and merge all chunks into one. It works for now, but can become a memory problem for large datasets.

Interestingly, having chunks along the horizontal (mesh) dimension works fine.

Thanks.

How To Reproduce

Steps to reproduce the behaviour:

import os

import iris
from iris.experimental.ugrid import PARSE_UGRID_ON_LOAD

from esmf_regrid.experimental.unstructured_scheme import (
    regrid_unstructured_to_rectilinear,
)

test_data_dir = iris.config.TEST_DATA_DIR
src_fn = os.path.join(
    test_data_dir, "NetCDF", "unstructured_grid", "lfric_ngvat_3D_1t_half_level_face_grid_derived_theta_in_w3.nc"
)
with PARSE_UGRID_ON_LOAD.context():
    theta = iris.load_cube(src_fn, "air_potential_temperature")

# Rechunk the dask array
theta_chunked = theta.copy(data=theta.core_data().rechunk([1, (19, 19), 864]))

# Load target grid cube.
tgt_fn = os.path.join(
    test_data_dir, "NetCDF", "global", "xyt", "SMALL_hires_wind_u_for_ipcc4.nc"
)
tgt = iris.load_cube(tgt_fn)

# Perform regridding.
result = regrid_unstructured_to_rectilinear(theta_chunked, tgt)
This results in a `ValueError`: "Dimension 1 has 1 blocks, adjust_chunks specified with 2 blocks"
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_23260/3888724239.py in <module>
     16 
     17 # Perform regridding.
---> 18 result = regrid_unstructured_to_rectilinear(theta_chunked, tgt)

~/mambaforge/envs/ideal_exo/lib/python3.8/site-packages/esmf_regrid/experimental/unstructured_scheme.py in regrid_unstructured_to_rectilinear(src_cube, grid_cube, mdtol)
    344     """
    345     regrid_info = _regrid_unstructured_to_rectilinear__prepare(src_cube, grid_cube)
--> 346     result = _regrid_unstructured_to_rectilinear__perform(src_cube, regrid_info, mdtol)
    347     return result
    348 

~/mambaforge/envs/ideal_exo/lib/python3.8/site-packages/esmf_regrid/experimental/unstructured_scheme.py in _regrid_unstructured_to_rectilinear__perform(src_cube, regrid_info, mdtol)
    285     # chunks cover the entire horizontal plane (otherwise they would break
    286     # the regrid function).
--> 287     new_data = _map_complete_blocks(
    288         src_cube,
    289         regrid,

~/mambaforge/envs/ideal_exo/lib/python3.8/site-packages/esmf_regrid/experimental/unstructured_scheme.py in _map_complete_blocks(src, func, dims, out_sizes)
     87         pass
     88 
---> 89     return data.map_blocks(
     90         func, chunks=out_chunks, drop_axis=dropped_dims, dtype=src.dtype
     91     )

~/mambaforge/envs/ideal_exo/lib/python3.8/site-packages/dask/array/core.py in map_blocks(self, func, *args, **kwargs)
   2394     @wraps(map_blocks)
   2395     def map_blocks(self, func, *args, **kwargs):
-> 2396         return map_blocks(func, self, *args, **kwargs)
   2397 
   2398     def map_overlap(self, func, depth, boundary=None, trim=True, **kwargs):

~/mambaforge/envs/ideal_exo/lib/python3.8/site-packages/dask/array/core.py in map_blocks(func, name, token, dtype, chunks, drop_axis, new_axis, meta, *args, **kwargs)
    739         adjust_chunks = None
    740 
--> 741     out = blockwise(
    742         func,
    743         out_ind,

~/mambaforge/envs/ideal_exo/lib/python3.8/site-packages/dask/array/blockwise.py in blockwise(func, out_ind, name, token, dtype, adjust_chunks, new_axes, align_arrays, concatenate, meta, *args, **kwargs)
    260                 elif isinstance(adjust_chunks[ind], (tuple, list)):
    261                     if len(adjust_chunks[ind]) != len(chunks[i]):
--> 262                         raise ValueError(
    263                             f"Dimension {i} has {len(chunks[i])} blocks, adjust_chunks "
    264                             f"specified with {len(adjust_chunks[ind])} blocks"

ValueError: Dimension 1 has 1 blocks, adjust_chunks specified with 2 blocks

Expected behaviour

Regridding that works on an arbitrarily chunked array.

Environment

  • OS and version: Ubuntu 20.04 LTS
  • esmf_regrid version: 0.3.dev0
  • iris version: 3.2.dev0

Refactor unstructured scheme tests

📰 Custom Issue

The structure of tests in tests.unit.experimental.unstructured_scheme (in the unstructured_scheme feature branch) is currently messy. Especially with respect to helper functions such as _grid_cube which exists in test__cube_to_GridInfo and _flat_mesh_cube which exists in test__regrid_unstructured_to_rectilinear__prepare. After a refactor, these should exist in a common place. It is also worth investigating to what extent pytest fixtures can improve the structure of the tests.

Rename the module

📰 Custom Issue

The current module name esmf_regrid does not match the name of the GitHub repo, or the package on conda-forge.

Questions:

  1. Should we rename something, if so, what? (module, repo, conda-forge)
  2. Should we namespace inside iris?

Use `enum` around the codebase

There are several examples in iris-esmf-regrid of 'modes', e.g. conservative/bilinear, source/target. So there are lots of checks for == "conservative" etcetera. These are vulnerable to spelling mistakes, and force object signatures to accept free text when there are only a few valid values.

These are prime cases for using Python's enum module instead. Once in place, code will be easier to follow and to modify, with fewer mistakes.
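A minimal sketch of what this could look like follows. The Method enum and the helper functions are illustrative names invented here, not the package's actual API. Looking a string up through the enum rejects misspellings immediately, and later comparisons use identity rather than free text.

```python
from enum import Enum


class Method(Enum):
    """Hypothetical enum of regridding methods; the names are illustrative."""

    CONSERVATIVE = "conservative"
    BILINEAR = "bilinear"


def normalise_method(method):
    """Return a Method, accepting either a member or its string value.

    Raises ValueError for anything else, including misspellings.
    """
    return Method(method)


def is_conservative(method):
    """Compare by identity instead of by free-text string."""
    return normalise_method(method) is Method.CONSERVATIVE
```

Accepting both the member and its string value in normalise_method is one way to keep existing string-based calls working while the codebase migrates.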

Support curvilinear grids

✨ Feature Request

Add support for grids defined by 2D lat/lon coords in the latlon to latlon regridder.

Add real integration and gradient field testing

✨ Feature Request

Add integration testing so that KGO is routinely tested

Integration tests are needed for the following:

  • Real data: From the iris-test-data repository:
    Testing the regridding of: Tripolar (ORCA) grid, Lambert Azimuthal Equal Area, Rotated pole, and ugrid data. (Tripolar is the most important).

  • Gradient field data: Should enable regridding at different resolutions to be easily assessed

KGO:

  • The known-good-output should be checking the exact data values (rather than visualisation of the regridding).
  • This could be implemented in line with the methods currently used in Iris.

Drop support for python 3.6

  • Iris 3.1 dropped support for python <3.7.
  • Iris 3.2 drops support for python <3.8.

This project should probably also move its minimum requirement to one of these two options.

Edit: Fixed python version numbers

ESMF regrid fails when there is more than one variable on an axis

🐛 Bug Report

When using iris-esmf-regrid to regrid data, such as rotated pole data, that includes auxiliary coords of true longitude, two coords are returned for an axis rather than a single one.

How To Reproduce

Steps to reproduce the behaviour:

  1. load CORDEX data on a rotated pole grid with lat and lon 2D coords, e.g. cube = iris.load_cube('pr_AFR-22_NCC-NorESM1-M_historical_r1i1p1_CLMcom-KIT-CCLM5-0-15_v1_day_19500101-19501231.nc')
  2. outcube2 = cube.regrid(newcube, ESMFAreaWeighted())

Expected behaviour

the aux coord should be ignored in favour of the dim coord

Screenshots

Environment

  • MO VDI RHEL7
  • esmf_regrid version:'0.4.0'

Additional context

We believe this can be resolved by having iris-esmf-regrid search for coords using dim_coords=True in these searches:

tgt_x = tgt_grid_cube.coord(axis="x")
tgt_y = tgt_grid_cube.coord(axis="y")
src_x = src_grid_cube.coord(axis="x")
src_y = src_grid_cube.coord(axis="y")

It should also fall back on the current implementation by using a try-except block.

The only scenario producing an error would then be no DimCoords and multiple AuxCoords on the same axis, which is truly ambiguous and would need the user to help by massaging the input.
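The lookup-with-fallback described above might be sketched as follows. This assumes the failed lookup raises a KeyError subclass (which iris.exceptions.CoordinateNotFoundError is understood to be) and that the keyword is spelled dim_coords; find_axis_coord and StubCube are hypothetical names, and StubCube only mimics the one method needed so the sketch runs without Iris.

```python
def find_axis_coord(cube, axis):
    """Prefer the dimension coordinate on `axis`, else fall back to any match."""
    try:
        return cube.coord(axis=axis, dim_coords=True)
    except KeyError:
        # Assumption: the "not found" error is a KeyError subclass.
        return cube.coord(axis=axis)


class StubCube:
    """Hypothetical stand-in that mimics only the lookup used above."""

    def __init__(self, dim_coords, aux_coords):
        self._dim = dim_coords  # e.g. {"x": "grid_longitude"}
        self._aux = aux_coords  # e.g. {"x": ["longitude"]}

    def coord(self, axis, dim_coords=None):
        found = [self._dim[axis]] if axis in self._dim else []
        if not dim_coords:
            found += self._aux.get(axis, [])
        if len(found) != 1:
            raise KeyError(f"Expected exactly 1 coordinate, found {len(found)}")
        return found[0]


# A rotated-pole-like case: one dim coord plus a true-longitude aux coord.
cube = StubCube({"x": "grid_longitude"}, {"x": ["longitude"]})
print(find_axis_coord(cube, "x"))
```

With no dimension coordinate and several auxiliary coordinates on one axis, the fallback lookup raises as well, which matches the "truly ambiguous" case described above.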

List and prioritise possible features for the project

The purpose of this issue is to record and discuss possible features we would want/need for this project and decide which are necessary and which are desirable.

Below is a categorised list of features, feel free to add more that I might have missed. Features which I think are optional I have marked with a "(?)".

Scope of support:

  • Regridders will be exposed as Iris schemes. These schemes will be able to handle cubes with more dimensions than just the grid.
  • Multiple ESMF RegridMethods will be supported including:
    • CONSERVE
    • BILINEAR
    • NEAREST_STOD (source to destination) (?)
  • Rectilinear and curvilinear grids can be represented as ESMF grids to be regridded.
  • Code will maintain flexibility to allow support for ESMF Meshes to be added easily.

Functionality of building the regridder:

  • Any use of ESMF should be restricted to the building of the regridder and the built regridder should contain no ESMF objects. The built regridder should contain only Python objects and sparse matrices.
  • The regridder should be "cachable". It should be possible to either save and load a regridder, or else to extract savable information and rebuild a regridder from that information.
  • The regridder should be able to be built in an environment without ESMF using only cached information. (?)
  • The building of the regridder should be able to be done (subgrids passed to ESMF) in chunks in order to save memory (and perhaps allow parallelisation). (?)
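The caching idea in the list above can be illustrated with a toy example in which the regridder is nothing more than a set of sparse weights. This is a sketch under stated assumptions: the (target, source, weight) triple layout, the JSON cache format and all function names are invented for illustration, and a real implementation would more likely use a sparse matrix library.

```python
import json


def apply_weights(weights, src_data, n_tgt):
    """Apply sparse weights, stored as (target, source, weight) triples."""
    result = [0.0] * n_tgt
    for tgt, src, weight in weights:
        result[tgt] += weight * src_data[src]
    return result


def save_weights(weights, n_tgt):
    """Serialise the weights to text; no ESMF objects are involved."""
    return json.dumps({"n_tgt": n_tgt, "weights": weights})


def load_weights(text):
    """Rebuild the weights from cached text, without needing ESMF."""
    cached = json.loads(text)
    return [tuple(w) for w in cached["weights"]], cached["n_tgt"]


# Two source cells averaged into one target cell (made-up numbers).
weights = [(0, 0, 0.5), (0, 1, 0.5)]
restored, n_tgt = load_weights(save_weights(weights, 1))
print(apply_weights(restored, [2.0, 4.0], n_tgt))
```

Because the cached form holds only plain numbers, it can be reloaded and applied in an environment where the library that computed the weights is not installed.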

Functionality of the built regridder:

  • The built regridder should be able to allow some form of multiprocessing (i.e. Dask).
  • The built regridder should be able to handle masked data.
  • It should be possible to choose between a regridder with a fixed mask or a flexible mask, one with a fixed mask will handle masking during the build step (and have improved performance), one with a flexible mask will handle masking after the build step.
  • There should be an Iris style mdtol setting for conservative regridding.
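One way to picture an mdtol-style setting with a flexible mask is sketched below. It assumes weights stored as (target, source, weight) triples and masks a target cell when the fraction of its weight coming from masked source cells exceeds mdtol. The function name and data layout are hypothetical, and the behaviour only approximates the Iris convention rather than reproducing it.

```python
def regrid_with_mdtol(weights, src_data, src_mask, n_tgt, mdtol):
    """Weighted average per target cell, masking cells with too much missing data.

    weights: (target, source, weight) triples. A target cell is masked
    (returned as None) when the weight fraction from masked source cells
    exceeds `mdtol`, or when it has no valid contributions at all.
    """
    total = [0.0] * n_tgt
    valid = [0.0] * n_tgt
    acc = [0.0] * n_tgt
    for tgt, src, w in weights:
        total[tgt] += w
        if not src_mask[src]:
            valid[tgt] += w
            acc[tgt] += w * src_data[src]
    result = []
    for t in range(n_tgt):
        masked_fraction = 1.0 - valid[t] / total[t] if total[t] else 1.0
        if valid[t] == 0.0 or masked_fraction > mdtol:
            result.append(None)
        else:
            # Renormalise by the valid weight only.
            result.append(acc[t] / valid[t])
    return result
```

Here mdtol=0 masks a target cell as soon as any contributing source cell is masked, while mdtol=1 keeps it unless every contributing cell is masked. Since the mask is supplied at call time, this corresponds to the "flexible mask" option, where masking is handled after the build step.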

Adopt Typing

✨ Feature Request

As this project is still in its infancy, I suggest that now is a good time to consider a strategy on adopting Python typing support for esmf-regrid.

Motivation

The benefits of typing are clear for developers, for integration with modern IDEs, and additionally for documentation.

Also see:

Handle discontiguities and "bad geometries" hiding under masks

📰 Custom Issue

In many kinds of (curvilinear) grids, often ORCA grids, discontiguous bounds cause regridding to fail. If we were to ignore the check on contiguous bounds, this would lead to the creation of a grid with "bad geometries", which causes problems for ESMF. In these cases, we find that the discontiguous cells tend to be masked. It could be possible to convert such grids to a form which properly ignores these bad cells and has ESMF calculate only for the good cells. This could be done either by searching for discontiguities or by using the mask to determine which cells to ignore. Since we don't want regridders to depend on the data in general, this ignoring of masked cells ought to be controlled by a keyword.

One possible approach for handling of unmasked curvilinear cells would be to convert them into some kind of MeshInfo object, or else allow a GridInfo object to be represented as an ESMF mesh containing only the good cells. Alternatively, it's worth investigating the add_mask keyword for the ESMPy object ESMF.Grid to see if this works in a way which would solve this problem.

Experimental sparse implementation

✨ Feature Request

Have an alternative Regridder class in the experimental directory. This will effectively be the implementation in #26. It should then be possible to develop regridding schemes which can choose which regridder class (and therefore sparse matrix implementation) to use.

Setup testing/CI scaffolding

I think we need some testing and CI stuff. Then we can start by adding test cases for simple regridding based on common grids (such as tripolar, regular, reduced gaussian, ...)

Regridding error due to non-contiguous coordinates

🐛 Bug Report

When working with datasets such as EC-Earth, that outputs ocean variables using the NEMO grid, iris-esmf fails to regrid the data due to an assertion error about coordinate bounds not being contiguous:

Traceback (most recent call last):
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/_main.py", line 499, in run
    fire.Fire(ESMValTool())
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/_main.py", line 443, in run
    process_recipe(recipe_file=recipe, config_user=cfg)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/_main.py", line 124, in process_recipe
    recipe.run()
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/_recipe.py", line 1882, in run
    self.tasks.run(max_parallel_tasks=self._cfg['max_parallel_tasks'])
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/_task.py", line 725, in run
    self._run_parallel(max_parallel_tasks)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/_task.py", line 768, in _run_parallel
    _copy_results(task, running[task])
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/_task.py", line 791, in _copy_results
    task.output_files, task.products = future.get()
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/_task.py", line 796, in _run_task
    output_files = task.run()
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/_task.py", line 263, in run
    self.output_files = self._run(input_files)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/preprocessor/__init__.py", line 599, in _run
    product.apply(step, self.debug)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/preprocessor/__init__.py", line 410, in apply
    self.cubes = preprocess(self.cubes, step,
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/preprocessor/__init__.py", line 346, in preprocess
    result.append(_run_preproc_function(function, item, settings,
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/preprocessor/__init__.py", line 302, in _run_preproc_function
    return function(items, **kwargs)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmvalcore/preprocessor/_regrid.py", line 628, in regrid
    cube = cube.regrid(target_grid, loaded_scheme)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/iris/cube.py", line 4177, in regrid
    regridder = scheme.regridder(self, grid)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmf_regrid/schemes.py", line 317, in regridder
    return ESMFAreaWeightedRegridder(src_grid, tgt_grid, mdtol=self.mdtol)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmf_regrid/schemes.py", line 346, in __init__
    regrid_info = _regrid_rectilinear_to_rectilinear__prepare(src_grid, tgt_grid)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmf_regrid/schemes.py", line 173, in _regrid_rectilinear_to_rectilinear__prepare
    srcinfo = _cube_to_GridInfo(src_grid_cube)
  File "/gpfs/projects/bsc32/software/suselinux/11/software/ESMValTool/2.5.0-exp/lib/python3.8/site-packages/esmf_regrid/schemes.py", line 44, in _cube_to_GridInfo
    assert lon.is_contiguous()
AssertionError
2022-06-21 10:50:18,830 UTC [781294] INFO    esmvalcore._main:515 

More information can also be found in this issue: ESMValGroup/ESMValTool#2661. This is probably due to the grid description at the poles. FYI @pepcos
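For context, the assertion that fails in the traceback checks that the longitude bounds are contiguous. The following is a minimal NumPy sketch of what contiguity means for 1D bounds. It is not the Iris implementation: the function name `bounds_are_contiguous` and the tolerance defaults are assumptions made for illustration only.

```python
import numpy as np


def bounds_are_contiguous(bounds, rtol=1e-5, atol=1e-8):
    """Hypothetical check: does each cell start where the previous one ends?

    bounds has shape (n_cells, 2), holding the lower and upper bound of each cell.
    """
    bounds = np.asarray(bounds, dtype=float)
    # Compare the lower bound of cell i+1 with the upper bound of cell i.
    return bool(np.allclose(bounds[1:, 0], bounds[:-1, 1], rtol=rtol, atol=atol))


# Cells that share edges are contiguous.
contiguous = bounds_are_contiguous([[0, 10], [10, 20], [20, 30]])
# A gap between the first and second cell breaks contiguity.
with_gap = bounds_are_contiguous([[0, 10], [12, 20], [20, 30]])
```

A grid whose cell bounds leave gaps or overlap would fail a check of this kind, which is consistent with the suspicion above about how the grid is described at the poles.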

How To Reproduce

Steps to reproduce the behaviour:

  1. Try to regrid an irregular-grid dataset that uses NEMO as the ocean model

Expected behaviour

To be able to regrid the datasets

Screenshots

Environment

  • OS and version: [e.g., Ubuntu 20.04 LTS]
  • esmf_regrid version: [e.g., From the command line run python -c "import esmf_regrid; print(esmf_regrid.__version__)"]


Support masked arrays

✨ Feature Request

Provide support for regridders to handle masked arrays. Regridders will allow control over the masked data tolerance, similar to iris area weighted conservative regridders.

Motivation

Data on iris cubes is commonly masked. Furthermore, the regridding of masked data is a situation where the user may want more control over the regridding process for performance reasons. The handling of masked data is central to the design and motivation for this package.
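To illustrate what a masked data tolerance could mean in practice, here is a minimal NumPy sketch. It is not the package's implementation: the helper name `regrid_with_mdtol`, the dense weights matrix, and the 1D data are assumptions made for illustration. The convention (a target cell is masked when the fraction of masked contributing data exceeds `mdtol`) is chosen to mirror the Iris area weighted behaviour mentioned above.

```python
import numpy as np


def regrid_with_mdtol(weights, src, mdtol=0.0):
    """Hypothetical helper: apply a weights matrix to masked 1D source data.

    weights has shape (n_target, n_source) and each row is assumed to sum to 1.
    """
    src = np.ma.asarray(src)
    valid = ~np.ma.getmaskarray(src)
    # Weighted sum over the unmasked source cells only.
    summed = weights @ np.where(valid, src.filled(0.0), 0.0)
    # Fraction of each target cell's weight that comes from valid source cells.
    valid_fraction = weights @ valid.astype(float)
    masked_fraction = 1.0 - valid_fraction
    # Renormalise so masked source cells do not drag the result towards zero.
    with np.errstate(divide="ignore", invalid="ignore"):
        result = summed / valid_fraction
    # Mask target cells whose masked fraction exceeds the tolerance,
    # and cells with no valid contribution at all.
    tgt_mask = (masked_fraction > mdtol) | (valid_fraction == 0)
    return np.ma.masked_array(result, mask=tgt_mask)


weights = np.array([[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]])
src = np.ma.masked_array([1.0, 3.0, 5.0], mask=[True, False, False])
strict = regrid_with_mdtol(weights, src, mdtol=0.0)   # first target cell masked
lenient = regrid_with_mdtol(weights, src, mdtol=0.5)  # first target cell kept
```

With `mdtol=0.0` any masked contribution masks the target cell, while a larger tolerance keeps the cell and averages over the valid source data only.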

Add docstrings

✨ Feature Request

Docstrings are incomplete for the Regridder class. These should be added where necessary.

Add skeleton of user guide

✨ Feature Request

Add a number of sections with very minimal descriptions or placeholders describing each of the features in iris-esmf-regrid.
Decide on the structure of the user guide. For example, the user guide ultimately ought to have:

  • A description of all the keyword arguments in the regridders and a demonstration of what they do.
  • A comparison of all the regridders/regridding methods and how their scopes differ (e.g. which cubes the bilinear regridder accepts that the area weighted regridder does not).
  • Worked examples/gallery
  • dev notes
  • move changelog to whats new
  • install instructions
  • change the README to better reflect the contents of the user guide.

etc.

Note that for the time being we should aim for the minimal acceptable version of this.

Add grid-to-grid regridding scheme

✨ Feature Request

Add a regridding scheme which maps a rectilinear cube onto a rectilinear cube. This can initially follow the pattern established in https://github.com/SciTools-incubator/iris-esmf-regrid/blob/unstructured_scheme/esmf_regrid/experimental/unstructured_scheme.py (with a view to incorporating ideas from #26). Copying this pattern is probably the quickest way to achieve a functional regridding scheme, which can then be fleshed out and restructured.

readthedocs build is failing

🐛 Bug Report

The readthedocs build has been failing for the last 3 months. We should:

  • fix this
  • add a Read the Docs (RTD) badge to the README.md so that failures are easily visible to developers

Add support for nearest neighbour regridding

✨ Feature Request

Allow regridding using the NEAREST_STOD ESMF method.

Details

Masking

There are multiple ways masking could be handled by such a regridder. Either each target point could be mapped from the nearest unmasked source point, or target points whose nearest source point is masked could themselves become masked. For the time being, we are likely to opt for the latter since this should be simpler to implement with the current architecture. An option to use the other kind of masking behaviour could be added in future.
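The two masking behaviours can be compared with a small NumPy sketch. This is an illustration only, not how ESMF or this package computes nearest neighbours: the helper `nearest_index` is hypothetical, and plain 1D distance stands in for distance on the sphere.

```python
import numpy as np


def nearest_index(src_points, tgt_points, candidates=None):
    """Hypothetical helper: index of the nearest source point per target point.

    If candidates (a boolean array) is given, only source points where it is
    True are considered.
    """
    distance = np.abs(tgt_points[:, None] - src_points[None, :])
    if candidates is not None:
        # Excluded source points are pushed infinitely far away.
        distance = np.where(candidates[None, :], distance, np.inf)
    return distance.argmin(axis=1)


src_points = np.array([0.0, 1.0, 2.0, 3.0])
src = np.ma.masked_array([10.0, 20.0, 30.0, 40.0], mask=[False, True, False, False])
tgt_points = np.array([0.9, 2.2])

# First behaviour: map from the nearest *unmasked* source point.
unmasked = ~np.ma.getmaskarray(src)
from_unmasked = src[nearest_index(src_points, tgt_points, candidates=unmasked)]

# Second behaviour: map from the nearest source point, so a masked source
# produces a masked target.
mask_follows = src[nearest_index(src_points, tgt_points)]
```

In the second behaviour the nearest-neighbour indices do not depend on the mask, so they could be computed once and reused for differently masked data, which fits the remark that it is simpler to implement.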

Source/Target

Another aspect of nearest neighbour regridding worth highlighting is the types of objects that can represent the source/target. While Conservative and Bilinear both require either Grids or Meshes, nearest neighbour can also take just a series of points as its source/target. This could be represented in Iris as a pair of 1D AuxCoords. This can be represented in ESMF as a LocStream. For the time being, it should be sufficient to simply match the level of support that Conservative and Bilinear have and add support for descriptions as series of points later.

Efficiency

Since the calculations of NEAREST_STOD regridding are simple, it should be possible to design a more efficient method for performing regridding calculations by using indexing rather than matrix multiplication. This is not necessary for a minimal delivery but could be upgraded later.
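The equivalence between the two approaches can be shown with a short NumPy sketch. The index array `nearest` is a made-up example of a nearest-neighbour result, and a dense matrix is used here for brevity where the package would use a sparse one.

```python
import numpy as np

# Hypothetical nearest-neighbour result: the source index chosen for each
# of four target points.
nearest = np.array([2, 0, 0, 3])
src = np.array([1.5, 2.5, 3.5, 4.5])

# Matrix form: one row per target point with a single 1 in the chosen column.
weights = np.zeros((nearest.size, src.size))
weights[np.arange(nearest.size), nearest] = 1.0
via_matmul = weights @ src

# Index form: the same values without building or multiplying a matrix.
via_index = src[nearest]
```

Both forms give the same values; the index form only needs to store one integer per target point.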

Investigate sparse package performance

📰 Custom Issue

Currently, sparse matrices for regridding calculations are provided by scipy.sparse. An alternative approach, described in #26, is to use the sparse package. The motivation to switch to sparse is that it ought to integrate better with dask and would therefore provide performance enhancements. Before this switch is made, we will want to demonstrate that the performance is in fact improved. This will probably have to happen after #57 has established a pattern for performance testing.

`dtype` of regridded cube changes after realizing the data

🐛 Bug Report

After regridding a cube with dtype=float32 using regrid_unstructured_to_rectilinear and realizing the data, the output cube's dtype is float64. If the data is saved and re-loaded before realizing the data, the dtype does not change.

How To Reproduce

import numpy as np

import iris
from iris.coords import DimCoord
from iris.cube import Cube
from iris.experimental.ugrid import PARSE_UGRID_ON_LOAD

import esmf_regrid
from esmf_regrid.experimental.unstructured_scheme import regrid_unstructured_to_rectilinear

print("iris version:", iris.__version__)
print("iris-esmf-regrid version:", esmf_regrid.__version__)

path = iris.sample_data_path('mesh_C4_synthetic_float.nc')
with PARSE_UGRID_ON_LOAD.context():
    cube = iris.load_cube(path)
    print("dtype input cube:", cube.dtype)

lat = DimCoord([10, 20, 30], standard_name='latitude', units='degrees_north')
lon = DimCoord([10, 20, 30], standard_name='longitude', units='degrees_east')
lat.guess_bounds()
lon.guess_bounds()
target = Cube(np.ones((3, 3)), dim_coords_and_dims=[(lat, 0), (lon, 1)])

regridded_cube = regrid_unstructured_to_rectilinear(cube, target)
print("dtype regridded cube before cube.data:", regridded_cube.dtype)

# If the following lines are added, the dtype is still float32 after realizing
# the data
# from pathlib import Path
# x = Path.home() / 'test.nc'
# iris.save(regridded_cube, x)
# regridded_cube = iris.load_cube(x)

regridded_cube.data
print("dtype regridded cube after cube.data:", regridded_cube.dtype)

prints

iris version: 3.4.0
iris-esmf-regrid version: 0.5.0
dtype input cube: float32
dtype regridded cube before cube.data: float32
dtype regridded cube after cube.data: float64

Expected behaviour

The dtype should be float32 for the regridded cube even after realizing the data.

Environment

  • OS and version: Red Hat Enterprise Linux 8.5 (Ootpa)
  • esmf_regrid version: 0.5.0

Regridder unification tasks

📰 Custom Issue

The work being done in #198 can be separated into different pull requests:

  • Create a single generic regridder class capable of regridding from any kind of grid/mesh to any other. Have all other regridders derive from this one. A consequence of this refactoring is that ESMFAreaWeighted and ESMFAreaWeightedRegridder will be able to act as a replacement for MeshToGridESMFRegridder/GridToMeshESMFRegridder (along with the new ESMFBilinear class). MeshToGridESMFRegridder and GridToMeshESMFRegridder are likely to be deprecated after their functionality is fully replaced (they remain the only regridders able to be saved/loaded). (#198)
  • #318
  • Refactor tests to better reflect the new locations of the code.
  • Add documentation describing the new regridders. (begun with #268)
  • Add support for mesh to mesh regridding. (#287)
  • Add deprecation warnings for MeshToGridESMFRegridder and GridToMeshESMFRegridder.

Add is_compatible method

In the case where there are cubes with multiple types of grid/mesh and multiple regridders, it would be good to have a quick way to check if a particular cube is compatible with a particular regridder. This could be done with an is_compatible method which would take a cube as an argument and return True when that cube could be appropriately regridded by the regridder. It is also possible that we could refine the equality checks to be more performant and focused on the information relevant to each regridder, e.g. only checking node lat/lon and face_node_connectivity for area weighted regridders.
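A rough sketch of how such a method might look is given below. Everything in it is hypothetical: `GridDescription` stands in for whatever grid/mesh information a real regridder would extract from a cube, and `SketchRegridder` is not a class in this package.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class GridDescription:
    """Hypothetical stand-in for the grid/mesh information held on a cube."""

    lats: np.ndarray
    lons: np.ndarray


class SketchRegridder:
    """Hypothetical regridder that remembers the source grid it was built for."""

    def __init__(self, src_grid):
        self._src = src_grid

    def is_compatible(self, grid):
        """Return True if ``grid`` matches the source grid of this regridder."""
        # Only the information relevant to the regridding is compared.
        return np.array_equal(self._src.lats, grid.lats) and np.array_equal(
            self._src.lons, grid.lons
        )


coarse = GridDescription(np.array([0.0, 30.0]), np.array([0.0, 90.0]))
fine = GridDescription(np.array([0.0, 10.0, 20.0]), np.array([0.0, 45.0, 90.0]))
regridders = [SketchRegridder(coarse), SketchRegridder(fine)]

# Pick the first regridder that can handle a given grid.
same_as_fine = GridDescription(fine.lats.copy(), fine.lons.copy())
chosen = next(r for r in regridders if r.is_compatible(same_as_fine))
```

Restricting the comparison to the fields a given regridder actually uses is what would allow the more focused equality checks suggested above.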

Review the API docstrings

📰 Custom Issue

Great to have the built API docs up.
But just roughly scanning, I reckon there is quite a bit in there that needs some attention / improvement.
For instance (in my personal views)..

  • esmf_regrid.schemes.ESMFAreaWeighted
    • I don't, personally, much fancy the use of the word "separated" for X and Y coords.
      You could say that the coordinate system is "separable", but I'm not sure this gives the right impression directly.
      I think the best approach in the Iris docs is to focus on the dimensions : to say that X and Y coordinates have separate dimensions
  • esmf_regrid.schemes.ESMFAreaWeightedRegridder
    • It says "Regridder class for unstructured to rectilinear Cubes.", but this seems at odds with "src_cube – A rectilinear instance of Cube"

So,
I'd suggest this could do with thorough general review, when we can find time.

I know from bitter experience (1) how very hard it is to precisely + clearly describe these different types of coordinate mappings, given that nearly all the recognised technical terms are used ambiguously in different contexts, and (2) how confused users are about these distinctions in the first place.
So, it's always an uphill struggle to define terms clearly.
But we can at least aim to be complete + consistent in our usage
-- which would at least be definite progress, since the existing Iris docs have made rather a mess of it !!
