Dataset downloader and processor for Watershed Hydrologic Modeling
Note: this repository is still under development!
- CAMELS/MOPEX/LAMAH
- Daymet
- ECMWF ERA5-Land
- GHS
- MODIS ET
- NEX-GDDP-CMIP5/6
- NLDAS
More details are shown in the following sections.
The CAMELS series data include:
- CAMELS-AUS (CAMELS-AUS: Hydrometeorological time series and landscape attributes for 222 catchments in Australia)
- CAMELS-BR (CAMELS-BR: Hydrometeorological time series and landscape attributes for 897 catchments in Brazil - link to files)
- CAMELS-CL (The CAMELS-CL dataset: catchment attributes and meteorology for large sample studies – Chile dataset)
- CAMELS-GB (CAMELS-GB: Hydrometeorological time series and landscape attributes for 671 catchments in Great Britain)
- CAMELS-US (The CAMELS data set: catchment attributes and meteorology for large-sample studies)
- CAMELS-YR (Catchment attributes and meteorology for large sample study in contiguous China)
We also support CANOPEX (Canada's MOPEX dataset) and LamaH-CE (similar to CAMELS, but for Central Europe). Because we use these datasets just like CAMELS, the code for them is also in data/data_camels.py.
If you can read Chinese, this blog may be a quick start for CAMELS (CAMELS-US), and this one for the other CAMELS datasets.
We recommend downloading the datasets manually. The download addresses are as follows:
- Download CAMELS-AUS
- Download CAMELS-BR
- Download CAMELS-CL
- Download CAMELS-GB
- Download CAMELS-US
- Download CAMELS-YR
- Download CANOPEX
- Download LamaH-CE
You can also use the following code to download CAMELS-US (notice: the unzipped file is 10+ GB):
import os
import definitions
from hydrodataset.data.data_camels import Camels
# DATASET_DIR is defined in the definitions.py file
camels_path = os.path.join(definitions.DATASET_DIR, "camels", "camels_us")
camels = Camels(camels_path, download=True)
For CAMELS-YR, it is enough to download 9_Normal_Camels_YR.zip.
To download CANOPEX, you have to deal with the GFW. In addition, CANOPEX contains no attribute data, so we use attribute data from HYSETS as an alternative.
After downloading, put each dataset in its own directory. The following file organization is recommended:
camels
│
└── camels_aus
    └── 01_id_name_metadata.zip
    └── 02_location_boundary_area.zip
    └── ...
└── camels_br
    └── ...
└── camels_cl
    └── ...
└── camels_gb
    └── ...
└── camels_us
    └── ...
└── camels_yr
    └── ...
canopex
└── Boundaries.zip
└── HYSETS_watershed_properties.txt
└── ...
lamah_ce
└── 2_LamaH-CE_daily.tar.gz
└── ...
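As a quick sanity check after arranging the files, the sketch below reports which directories of the recommended layout are missing. It is only an illustration: `find_missing_dirs`, `EXPECTED_DIRS`, and the root directory name `data` are hypothetical names that are not part of this repository, and the check looks only at directory names, not at the archives inside them.

```python
from pathlib import Path

# Directory names taken from the recommended layout above.
EXPECTED_DIRS = [
    "camels/camels_aus",
    "camels/camels_br",
    "camels/camels_cl",
    "camels/camels_gb",
    "camels/camels_us",
    "camels/camels_yr",
    "canopex",
    "lamah_ce",
]


def find_missing_dirs(root):
    """Return the expected dataset directories that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "data" is a hypothetical root directory; replace it with your own.
    for name in find_missing_dirs("data"):
        print(f"missing: {name}")
```

If you only work with a subset of the datasets, trim `EXPECTED_DIRS` accordingly.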
All methods for processing the CAMELS datasets are written in the Camels class in hydrobench/data/data_camels.py.
We download and process Daymet data for the 671 basins in CAMELS(-US).
Use hydrobench/app/download/download_daymet_camels_basin.py to download Daymet grid data within the boundaries of the basins in CAMELS.
We provide some scripts to process the Daymet grid data for basins:
- Regrid the raw data to the required resolutions (hydrobench/app/daymet4basins/regrid_daymet_nc.py)
- calculate_basin_mean_forcing_include_pet.py and calculate_basin_mean_values.py in hydrobench/app/daymet4basins can be used to get basin mean values
- If you want to get P (precipitation), PE (potential evapotranspiration), Q (streamflow) and Basin areas, please use hydrobench/app/daymet4basins/pbm_p_pe_q_basin_area.py
Although we provide tools that use the CDS Toolbox from ECMWF to retrieve ERA5-Land data, they have not worked reliably for us (even when the requested data is only a few MB). Hence, we recommend downloading the ERA5-Land data archive manually from https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-land?tab=form
TODO: Regrid the raw data to the required resolutions (src/regrid.py from https://github.com/pangeo-data/WeatherBench)
The dataset's full name is "Geospatial attributes and Hydrometeorological forcings of gages for Streamflow modeling".
"GHS" is an extension for the CAMELS dataset. It contains geospatial attributes, hydrometeorological forcings and streamflow data of 9067 gages over the Contiguous United States (CONUS) in the GAGES-II dataset.
We do not yet provide an online way to download the data. You can refer to the following paper to learn how to get it.
Wenyu Ouyang, Kathryn Lawson, Dapeng Feng, Lei Ye, Chi Zhang, & Chaopeng Shen (2021). Continental-scale streamflow modeling of basins with reservoirs: Towards a coherent deep-learning-based strategy. https://doi.org/10.1016/j.jhydrol.2021.126455
We provide Google Earth Engine scripts to download the PML V2 and MODIS MOD16A2_105 products for given basins:
TODO: provide a link -- Download basin mean values of ET data
Use hydrobench/app/modis4basins/trans_modis_et_to_camels_format.py to convert the ET data downloaded from GEE to the format of the forcing data in CAMELS.
NEX-GDDP-CMIP5 data for basins could be downloaded from Google Earth Engine. The code is here
For NEX-GDDP-CMIP6, data should be downloaded from this website
Use hydrodataset/app/climateproj4basins/trans_nexdcp30_to_camels_format.py to process NEX-GDDP-CMIP5 data for basins
We will provide a tool for NEX-GDDP-CMIP6 data soon.
The GEE script is here
Use hydrobench/app/download/download_nldas_hourly.py to download the NLDAS hourly data.
Notice: you must complete some required steps (see the comments in hydrobench/nldas4basins/download_nldas.py) before using the script.
Use hydrobench/app/nldas4basins/trans_nldas_to_camels_format.py to transform the data to the format of forcing data in CAMELS.
TODO: more processing scripts are needed for NLDAS grid data.
Use environment.yml to create the conda environment:
conda env create -f environment.yml
conda activate HydroDataset
Then you can try the Python scripts in the "app" directory.