jgcri / xanthos

An extensible global hydrologic framework

License: Other

Python 86.90% Jupyter Notebook 13.10%

climate hydrology simulation

xanthos's Issues

example.py run error.

Hi, I am trying to run the example file but I keep running into an error after "Start Hydropower Actual". I have tried changing my pandas version but I am getting the same error. Here is the error

File "pandas_libs\groupby.pyx", line 692, in pandas._libs.groupby.group_quantile
TypeError: must be real number, not tuple

Improvement: Remove GitLFS Dependency

Users have had issues with the GitLFS dependency and unreliability across platforms. To remedy this, I am proposing utilizing Zenodo's REST API for the example data that usually accompanies a release. This can be accessed by a get_example_data call if the user wishes to install it. Another advantage to this approach is that it will give advanced users the ability to opt out of carrying the example data overhead.

Add expected outputs

Add expected outputs from the preconfigured run.

question for Set up configuration file

What are you trying to add (e.g., PET, runoff, post-processing module)?

What are the input/output units of your module?

What is the timestep of your module?

Are you addressing a specific science question with your extension?

Is your issue related to integration, an error, or other? Please describe.

Installing Xanthos

If not already installed, install anaconda/2 before running install setup.py. Also add the flag --user to the install setup.py command to avoid permission errors when running on an HPC system.

GitHub actions CI

Add GitHub Actions CI to replace Travis-CI.

Create reader for 3D climate data

Create a reader to ingest standard climate data formats in 3D [day, latitude, longitude] to the format and units required by Xanthos [land_cell, month].
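
A minimal sketch of what such a reader could look like, assuming the daily data arrive as a NumPy array of shape (n_days, n_lat, n_lon), that a boolean land mask of shape (n_lat, n_lon) is available, and that each day has already been tagged with a 0-based month index. The function name and all parameters are hypothetical, and unit conversion is not covered here.

```python
import numpy as np


def daily_grid_to_land_cell_month(daily, land_mask, month_index, reduce="mean"):
    """Collapse [day, lat, lon] data to [land_cell, month].

    daily       : array of shape (n_days, n_lat, n_lon)
    land_mask   : boolean array of shape (n_lat, n_lon), True for land cells
    month_index : int array of shape (n_days,), 0-based month number of each day
    reduce      : "mean" (e.g. temperature) or "sum" (e.g. precipitation)
    """
    daily = np.asarray(daily, dtype=float)
    month_index = np.asarray(month_index)
    n_months = int(month_index.max()) + 1

    # Keep only land cells: (n_days, n_lat, n_lon) -> (n_days, n_land)
    land = daily[:, np.asarray(land_mask, dtype=bool)]

    out = np.empty((land.shape[1], n_months))
    for m in range(n_months):
        days = land[month_index == m]  # all daily values belonging to month m
        out[:, m] = days.sum(axis=0) if reduce == "sum" else days.mean(axis=0)
    return out
```

Whether a variable should be averaged or summed over the month depends on its meaning, so the `reduce` switch is left to the caller.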

Rogue print statement in ini_reader.py

I found this while tracking down something for Tom:

xanthos/xanthos/data_reader/ini_reader.py

Line 606 in 754f111

print('Warning: {} is not a valid parameter'.format(k))

I believe the best practice would be to change print(...) to logging.warning(...)

Also, I grepped the rest of the source for instances of print, and this is the only one.
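
The proposed change could look roughly like the sketch below. The logger name and the helper function are hypothetical and only illustrate replacing print(...) with a logging call; the message text follows the line quoted above, without the "Warning:" prefix because the log level already carries it.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("xanthos_sketch")


def warn_invalid_parameter(k):
    """Report an unrecognised configuration key through logging instead of print."""
    # Lazy %-style formatting: the message is only built if the record is emitted.
    logger.warning("%s is not a valid parameter", k)
```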

Add proper logger

Set up a proper logger that closes handlers on model completion.
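
One way to guarantee that handlers are closed when the model finishes is to wrap the run in a context manager. This is a sketch under assumptions, not the project's implementation: the logger name, the `model_logger` function, and the "LEVEL: message" format are all chosen here for illustration.

```python
import logging
from contextlib import contextmanager


@contextmanager
def model_logger(name="xanthos_sketch_run", logfile=None, level=logging.INFO):
    """Yield a configured logger and always remove and close its handlers."""
    logger = logging.getLogger(name)
    logger.setLevel(level)

    handlers = [logging.StreamHandler()]
    if logfile is not None:
        handlers.append(logging.FileHandler(logfile))

    formatter = logging.Formatter("%(levelname)s: %(message)s")
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)

    try:
        yield logger
    finally:
        # Runs on normal completion and on exceptions alike.
        for handler in handlers:
            logger.removeHandler(handler)
            handler.close()
```

With this shape, a run would be written as `with model_logger(logfile="run.log") as log: ...`, and repeated runs in the same process would not accumulate duplicate handlers.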

basin name error

In the reference data, the file BasinNames235.txt assigns the name "Rhine" to both basin ID 32 and basin ID 62. Basin ID 62 should be named "Rhone".

The mix up also appears to affect the basin names in the output csv files (e.g., Basin_runoff_235_XXXX.csv).

Output results in memory

Modify the model to return the results in-memory.

Xanthos raises error when attempting to write outputs in .netcdf format

Hello developers, I am new to Xanthos.

I installed the package with no issues by following the instructions provided. When I first ran the example with the default settings, all outputs were written as .csv files. However, when I set the output format to .netcdf, I got the following error (preceded by some warnings):

INFO: PET processed in 328.2244460582733 seconds---
INFO: Processing Runoff...
INFO: Processing spin-up and simulation for basins 1...235
INFO: Runoff processed in 13.122142791748047 seconds---
INFO: Processing Routing...
INFO: Routing processed in 447.40205478668213 seconds---
INFO: ---Simulation has finished successfully: 788.7534136772156 seconds ---
INFO: ---Start Accessible Water:
INFO: ---Accessible Water has finished successfully: 5.011179447174072 seconds ------
INFO: ---Start Drought Statistics:
INFO: Calculating drought thresholds
INFO: ---Drought Statistics has finished successfully: 0.6707887649536133 seconds ------
INFO: ---Start Hydropower Potential:
C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\hydropower\potential.py:36: FutureWarning: Support for multi-dimensional indexing (e.g. obj[:, None]) is deprecated and will be removed in a future version. Convert to a numpy array before indexing instead.
e_grids = q_ex_h * hyd_grid_data["elevD"][:, np.newaxis]
INFO: ---Hydropower Potential has finished successfully: 8.252254486083984 seconds ------
INFO: ---Start Hydropower Actual:
INFO: ---Hydropower Actual has finished successfully: 62.43630385398865 seconds ------
INFO: ---Start Diagnostics:
INFO: ---Diagnostics has finished successfully: 0.17566919326782227 seconds ------
INFO: ---Output simulation results:
DEBUG: Outputting data annually
DEBUG: Unit is km3peryear
C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\pandas\core\indexing.py:719: FutureWarning: Slicing a positional slice with .loc is not supported, and will raise TypeError in a future version. Use .loc with labels or .iloc with positions instead.
indexer = self._get_setitem_indexer(key)
DEBUG: pet output dimension is (67420, 31)
Traceback (most recent call last):
File "C:\Users\atmos\Documents\xanthos\install_examples.py", line 25, in <module>
xanthos.run_model(config_file)
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\model.py", line 121, in run_model
xth.execute()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\model.py", line 94, in execute
results = config_runner.run()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\configurations.py", line 136, in run
c.output_simulation()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\components.py", line 459, in output_simulation
output_writer.write()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\data_writer\out_writer.py", line 125, in write
self.write_data(filename, var, self.outputs[i], col_names=self.time_steps)
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\data_writer\out_writer.py", line 163, in write_data
self.save_netcdf(filename, data, var)
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\data_writer\out_writer.py", line 220, in save_netcdf
griddata[:, :] = data[:, :].copy()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\pandas\core\frame.py", line 3455, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\pandas\core\indexes\base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas_libs\index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
File "pandas_libs\index.pyx", line 82, in pandas._libs.index.IndexEngine.get_loc
TypeError: '(slice(None, None, None), slice(None, None, None))' is an invalid key

Thanks for any advice.
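
The traceback shows `data[:, :]` being applied to a pandas DataFrame, which interprets the tuple of slices as a column key and rejects it. One possible workaround, sketched here and not confirmed as the project's fix, is to convert the data to a NumPy array before slicing. The helper name is hypothetical, and the sketch assumes every column is numeric.

```python
import numpy as np


def as_2d_array(data):
    """Return a 2D float NumPy copy of array-like or DataFrame-like input."""
    # np.array copies its input; DataFrames are converted through __array__.
    arr = np.array(data, dtype=float)
    if arr.ndim != 2:
        raise ValueError("expected 2D data, got shape {}".format(arr.shape))
    return arr


# Plain NumPy arrays accept the [:, :] indexing that a DataFrame rejects.
griddata = np.zeros((2, 3))
griddata[:, :] = as_2d_array([[1, 2, 3], [4, 5, 6]])
```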

Pandas .ix deprecation warning

The hydropower module gives:

DeprecationWarning: .ix is deprecated.

This is an old style of pandas indexing and should be updated.

Questions regarding the configuration settings of pm_abcd_mrtm.ini

The example model data covers a period of 31 years, but I want to run the model using only ten years of data. However, I noticed that some values in the output directory are not on the same scale as those from the 31-year run, for example in GCAMRegion_runoff_km3peryear_pm_abcd_mrtm_watch.csv. How should the configuration file be set? When I run the model with the first ten years of the example data, I encounter the same issue, but running it with the full 31 years of data gives correct results.
I only modified the input files in the 'pet' and 'climate' folders, which are in .npy format. I haven't made any changes to other input files. The array dimensions are 67420*120, representing ten years of monthly data.
Do I need to modify anything else? Thank you!
Below is the content of my configuration file.
[Project]

ProjectName = pm_abcd_mrtm_watch_1971_1980

RootDir = d:\work\xanthos-main1\example

InputFolder = input

OutputFolder = output

RefDir = reference
pet_dir = pet

RoutingDir = routing

RunoffDir = runoff

DiagDir = diagnostics

AccWatDir = accessible_water

HydActDir = hydropower_actual

HistFlag = True

n_basins = 235

StartYear = 1971
EndYear = 1980

output_vars = pet, aet, q, soilmoisture, avgchflow

OutputFormat = 1

OutputUnit = 1

OutputInYear = 1

AggregateRunoffBasin = 1
AggregateRunoffCountry = 1
AggregateRunoffGCAMRegion = 1

PerformDiagnostics = 1

CreateTimeSeriesPlot = 0

CalculateDroughtStats = 1

CalculateAccessibleWater = 1

CalculateHydropowerPotential = 1

CalculateHydropowerActual = 1

Calibrate = 0

[PET]

pet_module = pm

[[penman-monteith]]

pet_dir = penman_monteith

pm_tas = tas_watch_monthly_degc_1971_1980.npy

pm_tmin = tasmin_watch_monthly_degc_1971_1980.npy

pm_rhs = rhs_watch_monthly_percent_1971_1980.npy

pm_rlds = rlds_watch_monthly_wperm2_1971_1980.npy

pm_rsds = rsds_watch_monthly_wperm2_1971_1980.npy

pm_wind = wind_watch_monthly_mpers_1971_1980.npy

pm_lct = lucc1901_2010_lump.npy

pm_nlcs = 8

pm_water_idx = 0

pm_snow_idx = 6

pm_lc_years = 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2005, 2010

[Runoff]

runoff_module = abcd

[[abcd]]

runoff_dir = abcd

calib_file = pars_watch_1971_1990_decadal_lc.npy

runoff_spinup = 120

jobs = -1

TempMinFile = example/input/climate/pr_gpcc_watch_monthly_mmpermth_1971_1980.npy

PrecipitationFile =example/input/climate/pr_gpcc_watch_monthly_mmpermth_1971_1980.npy

[Routing]

routing_module = mrtm

[[mrtm]]

routing_dir = mrtm

routing_spinup = 120

channel_velocity = velocity_half_degree.npy

flow_distance = DRT_half_FDISTANCE_globe.txt

flow_direction = DRT_half_FDR_globe_bystr50.txt

[Diagnostics]

VICDataFile = vic_watch_hist_nosoc_co2_qtot_global_annual_1971_2000.nc
WBMDataFile = wbm_qestimates.csv
WBMCDataFile = wbmc_qestimates.csv
UNHDataFile = UNH_GRDC_average_annual_1986_1995.nc

Scale = 0
[Drought]

threshold_start_year = 1971
threshold_end_year = 1980

threshold_nper = 12
[AccessibleWater]

ResCapacityFile = total_reservoir_storage_capacity_BM3.csv
BfiFile = bfi_per_basin.csv

HistEndYear = 1980
GCAM_StartYear = 1971
GCAM_EndYear = 1980
GCAM_YearStep = 1

MovingMeanWindow = 9

Env_FlowPercent = 0.1
[HydropowerPotential]

hpot_start_date = "1/1971"

q_ex = 0.7

ef = 0.8
[HydropowerActual]

hact_start_date = "1/1971"
[TimeSeriesPlot]
Scale = 1
MapID = 999

Can anyone give me some advice about how to run the MRTM module only?

I already have runoff data from a monthly rainfall-runoff model, and I want to use the routing module in my research. I would be grateful if anyone could give me some advice. Many thanks again.
In addition, I would like to know whether I can use other routing datasets for flow direction and velocity besides those used by MRTM, which is based on the DRT routing scheme.

how to display the result of .nc file or csv file

The model works, but I don't know how to display the result for a single month.
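
As a starting point, the sketch below picks one month out of a table laid out as [land_cell, month]. It assumes monthly (not annual) output whose first column is January of the start year; both helper names are hypothetical, and drawing the values on a map would additionally require the cell coordinates, which are not covered here.

```python
def month_column(year, month, start_year):
    """0-based column of (year, month) in a table starting in January of start_year."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    if year < start_year:
        raise ValueError("year is before the start of the record")
    return (year - start_year) * 12 + (month - 1)


def one_month(table, year, month, start_year):
    """Return the values of every land cell for a single month."""
    col = month_column(year, month, start_year)
    return [row[col] for row in table]
```

For a run starting in 1971, `month_column(1972, 3, 1971)` selects column 14, that is, March 1972.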

Line 208 in abcd.py gives NAN values

Reported by Guta W Abeshu in email to Mohamad on Saturday, August 15, 2020:

"I found a numerical issue that results in NAN in estimated runoff magnitudes in the recent version of Xanthos. This issue is only noticeable beyond 1985 in many basins.

Below here is the equation used for ET opportunity computation (Line 208, on the GitHub file abcd.py). The NAN values were detected when the term under the square root is < 0. In terms of magnitude, the values under square root during these times were found to be in the order of 1e-10. So, merely rounding the decimal places reduces them to zero and solves the problem. It is a small issue, but the NAN values generated here propagates to flow routing, causing additional problems. "
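
The remedy described in the report (treating tiny negative radicands as zero) can be sketched as follows. This is an illustration of the idea, not the code at line 208 of abcd.py; the function name and the tolerance are assumptions, with the tolerance chosen to sit above the reported magnitude of about 1e-10.

```python
import numpy as np


def safe_sqrt(x, tol=1e-8):
    """Square root that treats round-off negatives as zero.

    Values in [-tol, 0) are clamped to 0 before the root is taken. Anything
    more negative than -tol is reported as an error instead of being hidden.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x < -tol):
        raise ValueError("radicand is negative beyond the round-off tolerance")
    return np.sqrt(np.clip(x, 0.0, None))
```

Clamping only within a tolerance keeps genuine sign errors visible, which plain rounding of the radicand would also mask.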

git lfs clone fix

git lfs install
git clone https://github.com/JGCRI/xanthos.git

When attempting the above from a Windows shell, the download hangs.

Solution from here is to use git lfs clone https://github.com/JGCRI/xanthos.git, which works. This should maybe be added to point 2 of getting started instructions.

Remove bare excepts

There are several bare except: statements. Catch a specific exception instead. (See https://www.python.org/dev/peps/pep-0008/#programming-recommendations)
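
A small illustration of the difference, using a hypothetical helper unrelated to the Xanthos code base: a bare except: would also swallow KeyboardInterrupt and SystemExit, whereas naming the exception keeps unrelated failures visible.

```python
def read_float(text, default=None):
    """Parse a float, falling back to `default` only when the text is malformed."""
    try:
        return float(text)
    except ValueError:
        # Only malformed numeric strings land here. Other problems, such as
        # passing None (TypeError), still propagate to the caller.
        return default
```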

Add option to toggle on/off for standard outputs

Add an option in the Xanthos configuration file and code that allows for only certain outputs to be written to disk

What does HistFlag do?

It is unclear what the intent is of the HistFlag configuration variable. The wiki says

If True, channel storage and soil moisture files are saved; if False, channel storage and soil moisture files are loaded when using the GWAM runoff module.

This isn't exactly correct, as the MRTM routing module also uses this flag to check whether to load channel storage values.

The example configuration file has

# HistFlag = True, historic mode ; = False, future mode
HistFlag = True

but it isn't clear what is meant by historic mode or future mode.

This is also the only place where the truthy value is "True"/"False" (although "true" is allowed in some cases, and "t", "y", "yes", and "1" are even allowed in one case). The preferred alternative would be 0/1 like the other flag options.
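
If the flag moved to 0/1, a single parser could keep accepting the older spellings during a transition. The sketch below is hypothetical: the accepted true values are the ones listed above, while the false values are an assumed mirror of them.

```python
_TRUE_VALUES = {"1", "true", "t", "y", "yes"}
_FALSE_VALUES = {"0", "false", "f", "n", "no"}  # assumed counterparts


def parse_flag(value):
    """Convert a configuration flag to bool, rejecting anything ambiguous."""
    text = str(value).strip().lower()
    if text in _TRUE_VALUES:
        return True
    if text in _FALSE_VALUES:
        return False
    raise ValueError("not a valid flag value: {!r}".format(value))
```

Routing every flag through one function of this kind would make "True", "true", and 1 behave identically across the configuration file.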

Documentation and implementation of `runoff_file` and `alt_runoff`

Hello,

A postdoc user of Xanthos pointed out this issue to me. They were trying to reproduce the example run but with user-provided PET and RUNOFF files, but a few things conspire to prevent this:

(1) The documentation here describes a runoff_file setting under the [Runoff] configuration tag. As far as I can see in the code, this setting is totally ignored. Instead, one needs to use the [Routing] [[mrtm]] alt_runoff setting.
(2) Even so, if pet_module = none and runoff_module = none, neither the pet_file nor the alt_runoff file gets passed on to the routing module (or the result files).

I will submit a Pull Request that can fix this issue for this particular use case, but I'm not as familiar with the broader context of the model so you may wish to address in a different way!

output csvs unit column

Currently the output csvs (e.g. Basin_runoff_235_XXXX.csv) have a final column header that specifies units (e.g., Unit (km^3/year)). This header doesn't actually refer to a column of data in the table, making the csv difficult to work with. I think the unit needs to be either:
(1) included as a full column,
(2) included as a comment at the top of the file (best option in my opinion), or
(3) included in the file name.
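
Option (2) could be sketched as below, assuming a "# unit: ..." comment line as the convention. The function names, the comment prefix, and the sample row are all hypothetical and only show that the unit can travel with the file without appearing as a column.

```python
import csv


def write_csv_with_unit(handle, header, rows, unit):
    """Write a leading comment line with the unit, then an ordinary CSV table."""
    handle.write("# unit: {}\n".format(unit))
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


def read_csv_with_unit(handle):
    """Read a file written by write_csv_with_unit; returns (unit, header, rows)."""
    unit = None
    data_lines = []
    for line in handle:
        if line.startswith("#"):
            if line.startswith("# unit:"):
                unit = line[len("# unit:"):].strip()
        else:
            data_lines.append(line)
    table = list(csv.reader(data_lines))
    return unit, table[0], table[1:]
```

Readers that support comment lines, such as pandas.read_csv with its comment parameter, could then load the table directly.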

Read first command-line argument as config file

Running Xanthos from the command line requires the first (only) argument to be the configuration file:

python xanthos/model.py example/pm_abcd_mrtm.ini

This errors because the flag -config_file was not specified. We should assume any extra argument is the -config_file parameter.
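
A sketch of how both call styles could be accepted with argparse. This is not the parser in model.py; the function name and help texts are made up, and only the -config_file flag name comes from the description above.

```python
import argparse


def parse_config_path(argv=None):
    """Accept the config file either positionally or through -config_file."""
    parser = argparse.ArgumentParser(description="Run the model from a config file.")
    parser.add_argument("config", nargs="?", help="path to the configuration file")
    parser.add_argument("-config_file", dest="config_flag",
                        help="path to the configuration file (explicit flag)")
    args = parser.parse_args(argv)

    # The explicit flag wins if both forms are given.
    config_file = args.config_flag or args.config
    if config_file is None:
        parser.error("a configuration file is required")
    return config_file
```

With this, `python xanthos/model.py example/pm_abcd_mrtm.ini` and the -config_file form would resolve to the same path.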

Question about output units (mm/year, km3/year)

Dear all, I've been working with xanthos for the past month and I'm having trouble understanding the conversion from mm/year to km3/year in the output files. Running the example file (https://zenodo.org/record/2578287) with the km3peryear and mmperyear output options gives different results. For example, for the Arctic Ocean Islands basin in the year 1971, the run with km3 gives a runoff of 200.778857, and the run with mm gives 224395.703522.

If I understand correctly, 1 mm of water corresponds to 1 kg of water (in volume terms: 1 mm x 1 m x 1 m), so to convert it to km3 I would have to divide the mm quantity by 1e12. As this doesn't give the same number, I divided the km3 column by the mm column in my dataset to see what the ratio between them is.

This gives a ratio between 0.001 and 0.003. It varies between basins and is correlated with the basin id. Hence, I thought that the conversion takes the area of the basin into account, but then mmperyear should not correspond to the total but to the mean or something like that. However, the ratio also varies slightly between years (for example, for Arctic Ocean Islands between 1971 and 2001 it varies between 0.001006 and 0.001143).

I would really appreciate it if someone could clarify this.
Thanks,
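
The dimensional arithmetic behind the question can be written out as follows. This is only the general depth-to-volume relation, not a statement of how Xanthos aggregates its outputs, and the function names are hypothetical.

```python
MM_PER_KM = 1.0e6  # 1 km = 1,000,000 mm


def depth_mm_to_volume_km3(depth_mm, area_km2):
    """Volume of a water layer: depth (converted to km) times area (km^2)."""
    return depth_mm / MM_PER_KM * area_km2


def implied_area_km2(ratio_km3_per_mm):
    """Area that would turn 1 mm of depth into `ratio_km3_per_mm` km^3."""
    return ratio_km3_per_mm * MM_PER_KM
```

Under this relation, 1 mm over 1 m^2 (1e-6 km^2) is 1e-12 km^3, which matches the division by 1e12 described above. A ratio of 0.001 to 0.003 km^3 per mm corresponds to an effective area of roughly 1000 to 3000 km^2, which is on the order of a single half-degree grid cell and much smaller than a basin. That may hint that the mm values are accumulated over cells, but the source does not confirm this.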

Diagnostics Future Warning

Hi! I've been using xanthos this past month and the Diagnostics module prints the following FutureWarning:

INFO: ---Start Diagnostics:
/home/rcalvo/miniconda3/envs/xanthos/lib/python3.7/site-packages/xanthos/diagnostics/diagnostics.py:130: FutureWarning: Slicing a positional slice with .loc is not supported, and will raise TypeError in a future version. Use .loc with labels or .iloc with positions instead.
agg_df.loc[-1, 1:] = agg_df.sum(numeric_only=True)

Just a heads up.

Thanks.

Why can't I find example data

Basin Name Typo

There is a basin in the BasinNames235.txt file called "South America Colorad". I think this should be "Colorado".

Trouble generating nc file

Hi,
When I select the csv or matlab format for the outputs, everything works, but there are problems with the NetCDF option. I get this message on the last line of processing the example data.

TypeError: '(slice(None, None, None), slice(None, None, None))' is an invalid key

How can I solve this issue?
Thanks
Miguel

Individual modules (e.g. ABCD, MRTM...)
Model extensions (hydropower, drought indices)
All utils functions
Data reading and writing

Refactor how configurations work

The configurations.py file has a lot of repetition, and adding a new configuration requires duplicating a lot of this code. We can think of a better system for this.