HydroDataset

Dataset downloader and processor for Watershed Hydrologic Modeling

Note: this repository is still under development!

Datasets zoo list

  • CAMELS/MOPEX/LAMAH
  • Daymet
  • ECMWF ERA5-Land
  • GHS
  • MODIS ET
  • NEX-GDDP-CMIP5/6
  • NLDAS

More details are shown in the following sections.

CAMELS/MOPEX/LAMAH

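The CAMELS series data include CAMELS-AUS, CAMELS-BR, CAMELS-CL, CAMELS-GB, CAMELS-US and CAMELS-YR (one subdirectory each in the recommended file organization shown later in this section).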

We also support CANOPEX (Canada's MOPEX dataset) and LamaH-CE (a CAMELS-like dataset for Central Europe). Because we use these datasets just like CAMELS, the code for them also lives in data/data_camels.py.

If you can read Chinese, this blog post may be a quick start for CAMELS (CAMELS-US), and this one for the other CAMELS datasets.

Download CAMELS/MOPEX/LAMAH datasets

We recommend downloading the datasets manually. The download addresses are as follows:

You can also use the following code to download CAMELS-US (note: the unzipped files take up 10+ GB):

import os
import definitions
from hydrodataset.data.data_camels import Camels

# DATASET_DIR is defined in the definitions.py file
camels_path = os.path.join(definitions.DATASET_DIR, "camels", "camels_us")
camels = Camels(camels_path, download=True)

For CAMELS_YR, it is enough to download 9_Normal_Camels_YR.zip.

To download CANOPEX, you may have to get past the GFW (Great Firewall). In addition, CANOPEX contains no attribute data, so we use the attribute data from HYSETS as an alternative.

After downloading, put each dataset in its own directory. The following file organization is recommended:

camels
├── camels_aus
│   ├── 01_id_name_metadata.zip
│   ├── 02_location_boundary_area.zip
│   └── ...
├── camels_br
│   └── ...
├── camels_cl
│   └── ...
├── camels_gb
│   └── ...
├── camels_us
│   └── ...
└── camels_yr
    └── ...
canopex
├── Boundaries.zip
├── HYSETS_watershed_properties.txt
└── ...
lamah_ce
├── 2_LamaH-CE_daily.tar.gz
└── ...
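
Before running any processing code, it can help to confirm that the directories match the layout above. The following is only a minimal sketch: `EXPECTED_LAYOUT` and `find_missing_dirs` are hypothetical names that are not part of this repository, and the check covers directories only, not the archive files inside them.

```python
from pathlib import Path

# Hypothetical mapping that mirrors the recommended layout above:
# top-level directory -> expected subdirectories (empty tuple = none).
EXPECTED_LAYOUT = {
    "camels": (
        "camels_aus",
        "camels_br",
        "camels_cl",
        "camels_gb",
        "camels_us",
        "camels_yr",
    ),
    "canopex": (),
    "lamah_ce": (),
}


def find_missing_dirs(dataset_root):
    """Return the expected dataset directories that do not exist yet."""
    root = Path(dataset_root)
    missing = []
    for top, subdirs in EXPECTED_LAYOUT.items():
        # When no subdirectories are expected, check the top-level directory itself.
        candidates = [root / top / sub for sub in subdirs] or [root / top]
        missing.extend(
            str(path.relative_to(root)) for path in candidates if not path.is_dir()
        )
    return missing
```

Calling `find_missing_dirs(definitions.DATASET_DIR)` would then list the directories that still need to be created, given as paths relative to the dataset root.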

Process datasets

All methods for processing CAMELS datasets are implemented in the Camels class in hydrobench/data/data_camels.py.

Daymet

We download and process Daymet data for the 671 basins in CAMELS(-US).

Download Daymet V4 dataset for basins in CAMELS

Use hydrobench/app/download/download_daymet_camels_basin.py to download Daymet grid data within the boundaries of the basins in CAMELS.

Process the raw Daymet V4 data

We provide some scripts to process the Daymet grid data for basins:

  • Regrid the raw data to the required resolutions (hydrobench/app/daymet4basins/regrid_daymet_nc.py)
  • Compute basin mean values with calculate_basin_mean_forcing_include_pet.py and calculate_basin_mean_values.py in hydrobench/app/daymet4basins
  • If you want to get P (precipitation), PE (potential evapotranspiration), Q (streamflow) and basin areas, please use hydrobench/app/daymet4basins/pbm_p_pe_q_basin_area.py
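
To illustrate what a "basin mean value" is, here is a minimal sketch of a weighted average over the grid cells that overlap a basin. It is not the implementation used by the scripts above. The function name `basin_mean` and the idea of weighting each cell by the share of it that lies inside the basin are assumptions made for this example.

```python
import math


def basin_mean(grid_values, weights):
    """Weighted mean of grid-cell values, skipping missing (NaN) cells.

    grid_values: one value per grid cell overlapping the basin
    weights: fraction (or area) of each cell that lies inside the basin
    """
    if len(grid_values) != len(weights):
        raise ValueError("grid_values and weights must have the same length")
    total = 0.0
    weight_sum = 0.0
    for value, weight in zip(grid_values, weights):
        if math.isnan(value):
            continue  # ignore cells without data
        total += value * weight
        weight_sum += weight
    if weight_sum == 0:
        return math.nan  # no valid cell covers the basin
    return total / weight_sum
```

For example, `basin_mean([2.0, 4.0], [1.0, 3.0])` weights the second cell three times as heavily as the first.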

ECMWF ERA5-Land

Download ERA5-Land data

Although we provide tools that use the CDS Toolbox from ECMWF to retrieve ERA5-Land data, they did not seem to work well (even when the data is only megabytes in size). Hence, we recommend manually downloading the ERA5-Land data archive from https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-land?tab=form

Process the downloaded ERA5-Land data

TODO: Regrid the raw data to the required resolutions (src/regrid.py from https://github.com/pangeo-data/WeatherBench)

GHS

The dataset's full name is "Geospatial attributes and Hydrometeorological forcings of gages for Streamflow modeling".

"GHS" is an extension for the CAMELS dataset. It contains geospatial attributes, hydrometeorological forcings and streamflow data of 9067 gages over the Contiguous United States (CONUS) in the GAGES-II dataset.

We do not yet provide an online way to download the data. Refer to the following paper to learn how to get it.

Wenyu Ouyang, Kathryn Lawson, Dapeng Feng, Lei Ye, Chi Zhang, & Chaopeng Shen (2021). Continental-scale streamflow modeling of basins with reservoirs: Towards a coherent deep-learning-based strategy. https://doi.org/10.1016/j.jhydrol.2021.126455

MODIS ET

Download basin mean ET data from GEE

We provide Google Earth Engine scripts to download the PML V2 and MODIS MOD16A2_105 products for given basins:

TODO: provide a link -- Download basin mean values of ET data

Process ET data to CAMELS format

Use hydrobench/app/modis4basins/trans_modis_et_to_camels_format.py to process the downloaded ET data from GEE to the format of forcing data in CAMELS.

NEX-GDDP-CMIP5/6

Download

NEX-GDDP-CMIP5 data for basins can be downloaded from Google Earth Engine. The code is here

For NEX-GDDP-CMIP6, data should be downloaded from this website

Process

Use hydrodataset/app/climateproj4basins/trans_nexdcp30_to_camels_format.py to process NEX-GDDP-CMIP5 data for basins

We will provide a tool for NEX-GDDP-CMIP6 data soon.

NLDAS

Download basin mean NLDAS data from GEE

The GEE script is here

Download NLDAS grid data from NASA Earthdata

Use hydrobench/app/download/download_nldas_hourly.py to download them.

Note: complete the necessary steps (see the comments in hydrobench/nldas4basins/download_nldas.py) before using the script.

Process NLDAS basin mean forcing

Use hydrobench/app/nldas4basins/trans_nldas_to_camels_format.py to transform the data to the format of forcing data in CAMELS.
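
As a rough illustration of what "the format of forcing data in CAMELS" means, the sketch below writes daily records as whitespace-separated text rows that start with date columns. It is not the code in the script above. The `Year Mnth Day Hr` column names, the fixed hour value of 12, the single-space separator, the two-decimal formatting and the function name `format_forcing_rows` are all assumptions modeled on the CAMELS-US daily forcing files, so check them against a real file before relying on them.

```python
from datetime import date

# Assumed leading columns, modeled on the CAMELS-US daily forcing files.
DATE_COLUMNS = ("Year", "Mnth", "Day", "Hr")


def format_forcing_rows(records, variables):
    """Turn (date, {variable: value}) pairs into CAMELS-style text lines.

    records: iterable of (datetime.date, dict) pairs, one per day
    variables: names of the forcing variables, in output column order
    """
    lines = [" ".join(DATE_COLUMNS + tuple(variables))]
    for day, values in records:
        # The hour column is fixed to 12 here (an assumption for daily data).
        fields = [f"{day.year}", f"{day.month:02d}", f"{day.day:02d}", "12"]
        fields += [f"{values[name]:.2f}" for name in variables]
        lines.append(" ".join(fields))
    return lines
```

The returned lines can be joined with newlines and written to one text file per basin.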

TODO: more processing scripts are needed for NLDAS grid data.

How to run the code

Use environment.yml to create a conda environment:

conda env create -f environment.yml
conda activate HydroDataset

Then you can try the Python scripts in the "app" directory.

Acknowledgement
