
pcr-globwb_model's Introduction

PCR-GLOBWB

PCR-GLOBWB (PCRaster Global Water Balance) is a large-scale hydrological model intended for global to regional studies and developed at the Department of Physical Geography, Utrecht University (Netherlands). This repository holds the model scripts of PCR-GLOBWB.

For additional information about the model, please check out the PCR-GLOBWB documentation at https://pcrglobwb.readthedocs.io/en/latest/.

Please also see the file README.txt.

Main reference/paper: Sutanudjaja, E. H., van Beek, R., Wanders, N., Wada, Y., Bosmans, J. H. C., Drost, N., van der Ent, R. J., de Graaf, I. E. M., Hoch, J. M., de Jong, K., Karssenberg, D., López López, P., Peßenteiner, S., Schmitz, O., Straatsma, M. W., Vannametee, E., Wisser, D., and Bierkens, M. F. P.: PCR-GLOBWB 2: a 5 arcmin global hydrological and water resources model, Geosci. Model Dev., 11, 2429-2453, https://doi.org/10.5194/gmd-11-2429-2018, 2018.

Input and output files (including OPeNDAP-based access: https://opendap.4tu.nl/thredds/catalog/data2/pcrglobwb/catalog.html)

The input files for the runs made in the aforementioned paper (Sutanudjaja et al., 2018) are available on the OPeNDAP server: https://opendap.4tu.nl/thredds/catalog/data2/pcrglobwb/version_2019_11_beta/pcrglobwb2_input/catalog.html. The OPeNDAP protocol (https://www.opendap.org) allows users to access PCR-GLOBWB input files on the remote server and perform PCR-GLOBWB runs without downloading the input files (total size ~250 GB for the global extent).

Some output files are also provided: https://opendap.4tu.nl/thredds/catalog/data2/pcrglobwb/version_2019_11_beta/example_output/global_05min_gmd_paper_output/catalog.html. More output files are available on https://geo.data.uu.nl/research-pcrglobwb/pcr-globwb_gmd_paper_sutanudjaja_et_al_2018/ (for requesting access, please send an e-mail to [email protected]).

How to install

Please follow these steps to install PCR-GLOBWB:

  1. You will need a working Python environment. We recommend installing Miniconda for Python 3, following the instructions at https://docs.conda.io/en/latest/miniconda.html. A user guide and short reference for conda are available in the conda documentation.

  2. Get the requirement or environment file from this repository conda_env/pcrglobwb_py3.yml and use it to install all modules required (e.g. PCRaster, netCDF4) to run PCR-GLOBWB:

    conda env create --name pcrglobwb_python3 -f pcrglobwb_py3.yml

    This will create an environment named pcrglobwb_python3.

  3. Activate the environment in a command prompt:

    conda activate pcrglobwb_python3

  4. Clone or download this repository. We suggest using the latest version of the model, which should also be on the default branch.

    git clone https://github.com/UU-Hydro/PCR-GLOBWB_model.git

    This will clone PCR-GLOBWB into the current working directory.

PCR-GLOBWB configuration .ini file

For running PCR-GLOBWB, a configuration .ini file is required. Some configuration .ini file examples are given in the config directory. To be able to run PCR-GLOBWB using these .ini file examples, there are at least two things that must be adjusted.

First, please make sure that you set the outputDir (output directory) to a directory that you have write access to. You do not need to create this directory manually.

Second, please make sure that the cloneMap file is stored locally on your machine. The cloneMap file defines the spatial resolution and extent of your study area and must be in the PCRaster format. Some examples are given in this repository in clone_landmask_maps/clone_landmask_examples.zip.

By default, the configuration .ini file examples given in the config directory will use PCR-GLOBWB input files from the 4TU.ResearchData server, as set in their inputDir (input directory).

inputDir = https://opendap.4tu.nl/thredds/dodsC/data2/pcrglobwb/version_2019_11_beta/pcrglobwb2_input/

This can be changed to any other location, e.g. a local directory if you have the input files stored on your own machine.
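
The adjustments described above could also be scripted. The following is only a minimal sketch using Python's standard configparser module. The section name globalOptions, the excerpt of the file, and all local paths are assumptions made for illustration; check them against the .ini example you actually use.

```python
import configparser
import io

# Hypothetical excerpt of a configuration file. The section name
# "globalOptions" and the local paths are assumptions for illustration.
EXAMPLE_INI = """\
[globalOptions]
outputDir = /scratch/example/output
cloneMap = /data/example/clone.map
inputDir = https://opendap.4tu.nl/thredds/dodsC/data2/pcrglobwb/version_2019_11_beta/pcrglobwb2_input/
"""


def adjust_global_options(ini_text: str, output_dir: str, clone_map: str) -> str:
    """Return the .ini text with outputDir and cloneMap replaced."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive (outputDir, not outputdir)
    parser.read_string(ini_text)
    parser["globalOptions"]["outputDir"] = output_dir
    parser["globalOptions"]["cloneMap"] = clone_map
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


adjusted = adjust_global_options(
    EXAMPLE_INI, "/home/me/pcrglobwb_out", "/home/me/clone_rhine.map"
)
print(adjusted)
```

Note that configparser drops comments when it writes a file, so editing the .ini file by hand may be preferable if you want to keep them.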

How to run

Please make sure that the correct conda environment is activated in a command prompt:

conda activate pcrglobwb_python3

Go to the PCR-GLOBWB model directory. You can start a PCR-GLOBWB run using the following command:

python deterministic_runner.py <ini_configuration_file>

where <ini_configuration_file> is the configuration file of PCR-GLOBWB.
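
If runs are started from a script rather than typed by hand, the same command can be assembled in Python. This is a small sketch, not part of the repository: the helper build_run_command and the example path config/setup_30min.ini are hypothetical, and the command is only built, not executed.

```python
import sys
from pathlib import Path
from typing import List


def build_run_command(ini_file: str, model_dir: str = ".") -> List[str]:
    """Build the command line for a run without executing it."""
    runner = Path(model_dir) / "deterministic_runner.py"
    # sys.executable is the interpreter of the currently active environment.
    return [sys.executable, str(runner), ini_file]


command = build_run_command("config/setup_30min.ini")
print(" ".join(command))
# To start the run, pass the list to subprocess.run(command, check=True).
```

Using sys.executable keeps the run inside the environment that is currently active, which avoids accidentally picking up a different Python installation.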

Exercises/cooking recipes

We included some exercise/cooking recipes for running PCR-GLOBWB. You can find these documents in the folder exercise within this repository. While these exercises were generally designed for our own computing facilities (e.g. velocity and eejit servers), they should be adaptable for use on other computing machines.

pcr-globwb_model's People

Contributors

bramdr, edwinkost, hsutanudjajacchms99, mcflugen, nielsdrost

pcr-globwb_model's Issues

Error when running with dynamicFloodPlain turned off

With the most recent changes on the develop branch, I get an error when trying to run PCR-GLOBWB with dynamicFloodPlain set to False (I'm using the Rhine-Meuse config file in grpc4bmi-examples repository). The error I get when running the deterministic_runner.py script is the following,

Traceback (most recent call last):
...
AttributeError: 'Routing' object has no attribute 'channelStorageCapacity'

As far as I can tell, the fix is simply to wrap the flood-plain specific code in something like: if self.floodPlain - at least that seems to work for my case. I will send a pull request with my fixes to see what you all think.
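
The kind of guard suggested here can be illustrated with a heavily simplified, hypothetical stand-in for the routing class. Only the attribute names floodPlain and channelStorageCapacity come from the issue; the class body, the update method, and the numbers are assumptions and do not reflect the actual model code.

```python
class Routing:
    """Hypothetical, heavily simplified stand-in for the model's routing class."""

    def __init__(self, flood_plain: bool) -> None:
        self.floodPlain = flood_plain
        if self.floodPlain:
            # Only defined when the flood-plain option is switched on.
            self.channelStorageCapacity = 100.0

    def update(self, channel_storage: float) -> float:
        """Return the volume spilling onto the flood plain (0.0 if disabled)."""
        if not self.floodPlain:
            # Guard: skip flood-plain specific code, so the missing
            # attribute is never accessed.
            return 0.0
        return max(0.0, channel_storage - self.channelStorageCapacity)


print(Routing(flood_plain=False).update(150.0))  # 0.0
print(Routing(flood_plain=True).update(150.0))   # 50.0
```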

PCRaster Module not found

When I try to run python deterministic_runner.py <ini_configuration_file (with absolute file path)>, Python seems unable to find the pcraster module.

I have tried to install pcraster from inside the model folder using the following command:
conda install -c conda-forge pcraster

However, the issue persists. I would appreciate help in resolving it.
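
One possible cause, which the issue does not confirm, is that the package was installed into a different environment than the one whose interpreter runs the script. A short diagnostic sketch using only the standard library can show which interpreter is active and whether it can locate a module. The helper name module_status is hypothetical.

```python
import importlib.util
import sys


def module_status(module_name: str) -> str:
    """Describe whether the running interpreter can locate a module."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return f"{module_name}: NOT found by {sys.executable}"
    return f"{module_name}: found at {spec.origin}"


print(module_status("pcraster"))
print(module_status("netCDF4"))
```

If pcraster is reported as not found, activating the pcrglobwb_python3 environment before running the script is the first thing to check.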

feature prefactorOptions (e.g. for tuning model parameters)

Description
Feature to use preFactors for tuning/adjusting model parameters.

Problem
There are some developments related to this feature. We should finalize this and make this feature formal.

Implementation
I suggest to merge from the development created in https://github.com/edwinkost/PCR-GLOBWB_model/blob/uly_2024/config/ulysses/version_2023-12-31/setup_6arcmin_ulysses_2LCs_version_2023-12-31_finalize_global_clean.ini

Downsides
None.

Make development branch

It would be good to make a separate "develop" branch (see also https://stackoverflow.com/a/39586780). This branch will accumulate enhancements/features developed for PCR-GLOBWB. The development branch can then be used for testing and added to the master branch when appropriate. Hotfixes will be added directly to the master branch.

include GLOBGM as a module

Description
Include GLOBGM (the MODFLOW groundwater model of PCR-GLOBWB) as a module.

Problem
Although it is known that PCR-GLOBWB can include a MODFLOW groundwater model (e.g. since Sutanudjaja et al. 2011 to Verkaik et al., 2014), we have not really formalized this feature yet in https://github.com/UU-Hydro/PCR-GLOBWB_model.

Implementation
Edwin will try to merge the latest developments related to GLOBGM, more specifically from https://github.com/UU-Hydro/GLOBGM and https://github.com/edwinkost/PCR-GLOBWB_GM-GLOB (e.g. https://github.com/edwinkost/PCR-GLOBWB_GM-GLOB/tree/modularizing)

Downsides
None, I guess.

Get rid of unused module imports

The model code is littered with modules that are imported but not used. This slows down the parsing of the code and adds dependencies that are not needed.

A static analyzer like pyright can be used to detect these. I recommend to at least configure a pre-commit hook to detect them before code gets committed.
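
To give an idea of what such a check does, here is a rough sketch based on Python's standard ast module. It is not a replacement for pyright or a similar tool: the function find_unused_imports is hypothetical, and it ignores cases such as star imports, names re-exported through __all__, and names used only inside strings.

```python
import ast
from typing import List


def find_unused_imports(source: str) -> List[str]:
    """Return imported names that never appear as a Name node in the source."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import os.path" binds the name "os"
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


SAMPLE = (
    "import os\n"
    "import sys\n"
    "import numpy as np\n"
    "from math import sqrt\n"
    "\n"
    "print(os.getcwd(), sqrt(4.0))\n"
)
print(find_unused_imports(SAMPLE))  # ['np', 'sys']
```

The source is only parsed, never executed, so the sample does not need numpy to be installed.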

PCR-GLOBWB 3 development tracking

This issue will list and track the progress made for PCR-GLOBWB 3. This is a general overview. For each specific item, a separate issue will be made that contains more details.

Enhancements

  • #45
  • PCR-GLOBWB supporting modules refactoring
  • PCR-GLOBWB core modules refactoring
  • PCR-GLOBWB continuous integration

Features

  • Flexible landcover type support
  • Flexible soil layers support
  • Flexible input file read frequency
  • Temporal input/output file naming (e.g. per year or per month)
  • Zarr file format support
  • Endorheic lake water balance
  • Demand-driven reservoir operations
  • Units switch from [m] to [kg m-2]
  • Non-linear baseflow
  • Lateral snow movement

Automatically derive input reading frequency from file

Currently, the frequency at which input parameters are read from file is hard coded in the model. For example the yearly land-surface inputs:

        # - assumption: annual resolution
        if self.noAnnualChangesInLandCoverParameter == False and self.dynamicIrrigationArea == False and \
          (currTimeStep.timeStepPCR == 1 or currTimeStep.doy == 1):
            msg = 'Read land cover fractions based on the given netcdf file.'

It would be better for this frequency to be derived from the files themselves, meaning that if a monthly file is given, data is read monthly. Moreover, if the final timestep of the input file is reached, the input data of the final timestep is re-read every year. This is not needed.
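
One way to approach this, sketched here under assumptions, is to compare the model date with the time stamps found in the file and to read only when a different time stamp becomes applicable. The functions applicable_index and needs_reading are hypothetical and the time axis is a plain list of dates; in the model it would come from the NetCDF time variable.

```python
import bisect
from datetime import date
from typing import List, Optional


def applicable_index(time_axis: List[date], current: date) -> int:
    """Index of the last time stamp that is not later than the current date."""
    position = bisect.bisect_right(time_axis, current) - 1
    return max(position, 0)  # before the first stamp, fall back to the first one


def needs_reading(time_axis: List[date], current: date, last_index_read: Optional[int]) -> bool:
    """Read only if the applicable time stamp differs from the one already loaded."""
    return applicable_index(time_axis, current) != last_index_read


yearly_axis = [date(2000, 1, 1), date(2001, 1, 1)]
print(needs_reading(yearly_axis, date(2000, 6, 15), last_index_read=0))  # False
print(needs_reading(yearly_axis, date(2001, 1, 1), last_index_read=0))   # True
print(needs_reading(yearly_axis, date(2005, 3, 1), last_index_read=1))   # False
```

With this approach a monthly file is read monthly and a yearly file yearly, and once the final time stamp has been loaded it is not read again in later years.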

Allow for input with time-related naming

It would be useful if the input file name could contain formatting patterns that are filled in with the time of the current model timestep. This is especially useful for meteorological forcing inputs, which often come as yearly NetCDF files.

A simple key-based format would be sufficient, and would also allow for user-customized formatting options (such as the number of digits). As an example:
netcdf_file_name = self.preFileNC.format(year = currTimeStep.year, month = currTimeStep.month, day = currTimeStep.day, hour = currTimeStep.hour, second = currTimeStep.second, milisecond = currTimeStep.milisecond)
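
A small self-contained illustration of the idea, using only built-in string formatting: the file-name pattern below is hypothetical, and a standard datetime object stands in for the model's current timestep. Keyword arguments that the pattern does not use are simply ignored by str.format, so the same call works for yearly, monthly, or daily patterns.

```python
from datetime import datetime

# Hypothetical file-name pattern as a user might write it in the configuration file.
pattern = "precipitation_{year:04d}-{month:02d}.nc"

current = datetime(1999, 3, 1)
file_name = pattern.format(
    year=current.year, month=current.month, day=current.day, hour=current.hour
)
print(file_name)  # precipitation_1999-03.nc
```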

open https://geo.data.uu.nl/research-pcrglobwb/pcr-globwb_gmd_paper_sutanudjaja_et_al_2018

Description
Make the output files that are available on https://geo.data.uu.nl/research-pcrglobwb/pcr-globwb_gmd_paper_sutanudjaja_et_al_2018/ completely open, so that users no longer have to request access from E.H.Sutanudjaja.

Problem
Access to these output files currently has to be requested from E.H.Sutanudjaja.

Implementation

  • Work with the Yoda team.
  • Rearrange the files, and push them to the vault.
  • Create a readme.
  • Publish.

Downsides
None

Refactoring guidelines

Guidelines for refactoring PCR-GLOBWB.

Planning

  • Create design diagrams before refactoring
    • Should include classes and their purpose
    • Should include interactions between classes through (public) function calls
  • Create a GitHub issue for the specific refactor (see the contributing document)

Structure

  • Modules and classes should be designed to be closed for modification but open for extension
  • Modules and classes should be independent (i.e. avoid interactions between classes)
  • Avoid methods not in a class (i.e. static methods)
  • Class methods should be small, with a maximum of two to three indentations, and have a specific purpose
  • Class variables are allocated during __init__()
  • Avoid class variable modification in deeper sub-functions

Styling

  • Code styling mainly follows the Python PEP8 specifications
    • Classes use TitleCase whereas properties/methods (and others) use snake_case
    • A leading underscore is used for private properties/methods
    • Avoid try-except block. If necessary, specify the exception expected
    • Prefer with block instead of calling 'close()'
    • Annotate all functions with the expected types (type-hinting)
    • Place (independent) classes and functions in separate files
  • Before publishing a refactor, format your code using isort (for imports) and black (for code).
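
As an illustration of several of the styling points above, here is a short hypothetical class. It is not taken from the model code; the class name StateWriter, its methods, and the output file are invented for the example.

```python
import tempfile
from pathlib import Path
from typing import List


class StateWriter:
    """Hypothetical class: TitleCase class name, snake_case methods, type hints."""

    def __init__(self, output_path: Path) -> None:
        # All instance variables are allocated in __init__().
        self.output_path = output_path
        self._lines_written = 0  # leading underscore marks a private attribute

    def write_states(self, states: List[float]) -> int:
        """Write one state value per line and return the number of lines written."""
        # A with block closes the file, so no explicit close() call is needed.
        with open(self.output_path, "w", encoding="utf-8") as handle:
            for value in states:
                handle.write(self._format_value(value))
        self._lines_written = len(states)
        return self._lines_written

    def _format_value(self, value: float) -> str:
        """Private helper with a single, specific purpose."""
        return f"{value:.3f}\n"


with tempfile.TemporaryDirectory() as folder:
    writer = StateWriter(Path(folder) / "states.txt")
    count = writer.write_states([0.5, 1.25])
    content = (Path(folder) / "states.txt").read_text(encoding="utf-8")
print(count, repr(content))
```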

Allow for temporal output writing

Especially for long, high-resolution simulations, storing all timesteps (for a single variable) in a single output file results in very large output files. It would be nice if an output frequency could be specified under the reportingOptions in the configuration file. For example:

[reportingOptions]
formatNetCDF = NETCDF4
frequency = year
zlib = True
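
How such a frequency option might translate into output file names is sketched below. This is an assumption about a possible implementation, not existing model behaviour: the function output_file_name, the accepted frequency values, and the naming scheme are all hypothetical.

```python
from datetime import date


def output_file_name(variable: str, current: date, frequency: str) -> str:
    """Return the NetCDF file name a timestep would be written to."""
    if frequency == "year":
        suffix = f"_{current.year:04d}"
    elif frequency == "month":
        suffix = f"_{current.year:04d}-{current.month:02d}"
    elif frequency == "none":
        suffix = ""  # single file for the whole run
    else:
        raise ValueError(f"Unknown output frequency: {frequency!r}")
    return f"{variable}{suffix}.nc"


print(output_file_name("discharge", date(2001, 5, 17), "year"))   # discharge_2001.nc
print(output_file_name("discharge", date(2001, 5, 17), "month"))  # discharge_2001-05.nc
```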

(some) files not being read from openDAP

Hi, I am Bart, RSE at the eScience Center. I am currently working on getting the newest version of PCR-GLOBWB to work with eWaterCycle.

For this I have created a Docker container which packages this model with grpc4bmi. You can view that here (as a text file): Dockerfile.txt.

Things mostly seem to work until I try to initialize the model. Note that I am using version 2.4.0, with the setup_30min.ini config file.

After 25 seconds of initialization (and network activity indicating that files are downloaded using opendap), the model fails with the following error:

    No such file or directory: '/home/bart/ewatercycle/output/pcrglobwb_20231206_095919/global_30min/initialConditions/non-natural/consistent_run_201903XX/1999/interceptStor_forest_1999-12-31.nc'"

Which is strange, because the global_30min/... part is supposed to come from the input directory, and not the output directory right?

The global options in the .ini file are as follows:

outputDir = /home/bart/ewatercycle/output/pcrglobwb_20231206_095919
cloneMap = /home/bart/ewatercycle/parameter-sets/pcrglobwb_rhinemeuse_30min/cloneMaps/RhineMeuse30min.map
inputDir = https://opendap.4tu.nl/thredds/dodsC/data2/pcrglobwb/version_2019_11_beta/pcrglobwb2_input/

Unaligned input regridding to clone mask

When regridding a coarse-resolution (netCDF) array, a factor is applied and the array is uniformly downscaled based on this factor. However, in cases where the fine-resolution bounds do not line up with the coarse-resolution bounds, this will not work. Instead, the regridded coarse-resolution array will be too large.

At minimum, the model should stop running and give a clear error when this may occur. Better would be to incorporate regridding for unaligned grids.
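
A minimal sketch of the kind of check that could trigger such an error, for one axis only. The function names is_aligned and check_alignment and the example grids are hypothetical. The assumption is that grids count as aligned when the coarse cell size is a whole multiple of the fine cell size and the two grid origins differ by a whole number of fine cells.

```python
import math


def is_aligned(coarse_origin: float, coarse_size: float,
               fine_origin: float, fine_size: float,
               tolerance: float = 1e-6) -> bool:
    """True if fine cells nest exactly inside coarse cells along one axis."""
    ratio = coarse_size / fine_size
    offset = (fine_origin - coarse_origin) / fine_size
    return (round(ratio) >= 1
            and math.isclose(ratio, round(ratio), abs_tol=tolerance)
            and math.isclose(offset, round(offset), abs_tol=tolerance))


def check_alignment(coarse_origin: float, coarse_size: float,
                    fine_origin: float, fine_size: float) -> None:
    """Stop with a clear message instead of producing a wrongly sized array."""
    if not is_aligned(coarse_origin, coarse_size, fine_origin, fine_size):
        raise ValueError("Input grid and clone map are not aligned; cannot regrid by a factor.")


# 30 arcmin input starting at -180 degrees, 5 arcmin clone starting at 3 degrees
print(is_aligned(-180.0, 0.5, 3.0, 1.0 / 12.0))   # True
# Same grids, but the clone is shifted by a fraction of a cell
print(is_aligned(-180.0, 0.5, 3.04, 1.0 / 12.0))  # False
```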

how to open and convert this Pcraster format

I would like to read the map files of the 5 arcmin model. I tried installing PCRaster, but I don't know how to open this file format or convert it to another format (e.g. asc or nc). I would be very grateful for any information about this.

A question about the principle of routing module

Dear author,
I found that the channelStorage of each grid cell receives the total runoff directly. Does this mean that the presence of a river in a grid cell does not matter? I did not see any code that checks whether a grid cell contains a river.
I think this means that the routing module does not include hillslope routing (flow concentration) calculations. Is that reasonable?

Continue previously unfinished runs

When running long high-resolution PCR-GLOBWB simulations, the simulation time often exceeds the time allowed by the Slurm manager on external servers. It may also be that some other error causes (longer) simulations to stop. It would be nice if we could continue previously unfinished runs, using the same output directory.

Currently, PCR-GLOBWB appends its outputs to the output NetCDFs per timestep and stores its states to the state pcraster maps per year. An argument should be added to the PCR-GLOBWB runner (e.g. --continue), where the model does not clean up the output directory. Rather it checks:

  1. If the output files already exist
  2. If the state files already exist
  3. The latest available state file

It then continues the previous run from the latest available state file, appending to the output files when necessary.
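
The third check, finding the latest available state file, could look roughly like the sketch below. The file-name convention (a date of the form YYYY-MM-DD before a .map extension) and the function latest_state_date are assumptions for illustration; the actual state file names may differ.

```python
import re
from datetime import date
from typing import Iterable, Optional

# Assumed naming convention: <state>_<YYYY-MM-DD>.map
DATE_PATTERN = re.compile(r"_(\d{4})-(\d{2})-(\d{2})\.map$")


def latest_state_date(file_names: Iterable[str]) -> Optional[date]:
    """Return the most recent date found in state file names, or None."""
    dates = []
    for name in file_names:
        match = DATE_PATTERN.search(name)
        if match:
            year, month, day = (int(part) for part in match.groups())
            dates.append(date(year, month, day))
    return max(dates) if dates else None


names = [
    "interceptStor_forest_1999-12-31.map",
    "interceptStor_forest_2000-12-31.map",
    "notes.txt",
]
print(latest_state_date(names))  # 2000-12-31
```

In a real run the list of names would come from the states directory inside the output directory, and a return value of None would mean that the run has to start from the beginning.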
