jgcri / xanthos
An extensible global hydrologic framework
License: Other
Hi, I am trying to run the example file, but I keep running into an error after "Start Hydropower Actual". I have tried changing my pandas version, but I get the same error. Here is the error:
File "pandas\_libs\groupby.pyx", line 692, in pandas._libs.groupby.group_quantile
TypeError: must be real number, not tuple
Users have had issues with the GitLFS dependency and its unreliability across platforms. To remedy this, I am proposing utilizing Zenodo's REST API for the example data that usually accompanies a release. This can be accessed by a get_example_data call if the user wishes to install it. Another advantage to this approach is that it will give advanced users the ability to opt out of carrying the example data overhead.
Add expected outputs from the preconfigured run.
If not already installed, install anaconda/2 before running install setup.py. Also add the flag --user to the install setup.py command to avoid permissions errors if running on an HPC system.
Add GitHub Actions CI to replace Travis-CI.
Create a reader to ingest standard climate data formats in 3D [day, latitude, longitude] to the format and units required by Xanthos [land_cell, month].
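A rough sketch of what such a reader could do, assuming the daily data is already in memory as a NumPy array. The function name, the month_index argument, and the land-cell index arrays are all hypothetical, and unit conversion is left out because the required units depend on the variable:

```python
import numpy as np


def daily_grid_to_land_cell_month(daily, month_index, land_rows, land_cols):
    """Collapse [day, latitude, longitude] data to [land_cell, month] means.

    daily       : array of shape (n_days, n_lat, n_lon)
    month_index : integer array of length n_days, 0-based month of each day
    land_rows   : latitude index of each land cell (hypothetical lookup)
    land_cols   : longitude index of each land cell (hypothetical lookup)
    """
    month_index = np.asarray(month_index)
    # Pick out the land cells only: result has shape (n_days, n_land_cells).
    cells = daily[:, land_rows, land_cols]
    n_months = int(month_index.max()) + 1
    out = np.empty((cells.shape[1], n_months))
    for m in range(n_months):
        # Average all days that fall in month m (use a sum for flux totals).
        out[:, m] = cells[month_index == m].mean(axis=0)
    return out


# Tiny demo: 4 days on a 2 x 2 grid, two "months" of two days each.
demo_daily = np.arange(16, dtype=float).reshape(4, 2, 2)
demo = daily_grid_to_land_cell_month(
    demo_daily, [0, 0, 1, 1], np.array([0, 1]), np.array([0, 1])
)
```

For precipitation-like variables a monthly sum rather than a mean would probably be the right aggregation.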
I found this while tracking down something for Tom:
xanthos/xanthos/data_reader/ini_reader.py
Line 606 in 754f111
I believe the best practice would be to change print(...) to logging.warning(...). Also, I grepped the rest of the source for instances of print, and this is the only one.
Set up a proper logger that closes handlers on model completion.
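One possible shape for this, using only the standard logging module. The logger name and the helper names open_logger and close_logger are made up for illustration and are not taken from the Xanthos source:

```python
import logging


def open_logger(name="xanthos_sketch", level=logging.INFO):
    """Create a logger with a single console handler (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger


def close_logger(logger):
    """Close and detach every handler so repeated runs do not leak them."""
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)


logger = open_logger()
try:
    # A warning that would otherwise have been a bare print(...) call.
    logger.warning("setting not found in config; falling back to default")
finally:
    # Runs on success and on failure, i.e. "on model completion".
    close_logger(logger)
```

Closing handlers in a finally block matters most when the model is run several times in one Python session, since leftover handlers would otherwise duplicate every message.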
In the reference data, the file BasinNames235.txt assigns the name "Rhine" to both basin ID 32 and basin ID 62. Basin ID 62 should be named "Rhone".
The mix-up also appears to affect the basin names in the output csv files (e.g., Basin_runoff_235_XXXX.csv).
Modify the model to return the results in-memory.
Hello developers, I am new to Xanthos.
I have installed the package with no issues following the instructions provided. When I ran the example the first time, I kept the default settings and got all outputs as .csv files. However, when I chose the NetCDF output format, I got the following error (preceded by some warnings):
INFO: PET processed in 328.2244460582733 seconds---
INFO: Processing Runoff...
INFO: Processing spin-up and simulation for basins 1...235
INFO: Runoff processed in 13.122142791748047 seconds---
INFO: Processing Routing...
INFO: Routing processed in 447.40205478668213 seconds---
INFO: ---Simulation has finished successfully: 788.7534136772156 seconds ---
INFO: ---Start Accessible Water:
INFO: ---Accessible Water has finished successfully: 5.011179447174072 seconds ------
INFO: ---Start Drought Statistics:
INFO: Calculating drought thresholds
INFO: ---Drought Statistics has finished successfully: 0.6707887649536133 seconds ------
INFO: ---Start Hydropower Potential:
C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\hydropower\potential.py:36: FutureWarning: Support for multi-dimensional indexing (e.g. obj[:, None]) is deprecated and will be removed in a future version. Convert to a numpy array before indexing instead.
e_grids = q_ex_h * hyd_grid_data["elevD"][:, np.newaxis]
INFO: ---Hydropower Potential has finished successfully: 8.252254486083984 seconds ------
INFO: ---Start Hydropower Actual:
INFO: ---Hydropower Actual has finished successfully: 62.43630385398865 seconds ------
INFO: ---Start Diagnostics:
INFO: ---Diagnostics has finished successfully: 0.17566919326782227 seconds ------
INFO: ---Output simulation results:
DEBUG: Outputting data annually
DEBUG: Unit is km3peryear
C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\pandas\core\indexing.py:719: FutureWarning: Slicing a positional slice with .loc is not supported, and will raise TypeError in a future version. Use .loc with labels or .iloc with positions instead.
indexer = self._get_setitem_indexer(key)
DEBUG: pet output dimension is (67420, 31)
Traceback (most recent call last):
File "C:\Users\atmos\Documents\xanthos\install_examples.py", line 25, in <module>
xanthos.run_model(config_file)
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\model.py", line 121, in run_model
xth.execute()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\model.py", line 94, in execute
results = config_runner.run()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\configurations.py", line 136, in run
c.output_simulation()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\components.py", line 459, in output_simulation
output_writer.write()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\data_writer\out_writer.py", line 125, in write
self.write_data(filename, var, self.outputs[i], col_names=self.time_steps)
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\data_writer\out_writer.py", line 163, in write_data
self.save_netcdf(filename, data, var)
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\xanthos\data_writer\out_writer.py", line 220, in save_netcdf
griddata[:, :] = data[:, :].copy()
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\pandas\core\frame.py", line 3455, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\atmos\Anaconda3\envs\earth-analytics-python\lib\site-packages\pandas\core\indexes\base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 82, in pandas._libs.index.IndexEngine.get_loc
TypeError: '(slice(None, None, None), slice(None, None, None))' is an invalid key
Thanks for any advice.
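Reading the traceback, the failing line in save_netcdf is data[:, :], and the error is raised from pandas' frame.py, so data appears to be a DataFrame at that point. A DataFrame does not accept a tuple of slices through plain [] indexing. A possible workaround, sketched here with made-up data and a hypothetical helper name (this is not the project's actual fix), is to convert to a NumPy array before slicing:

```python
import numpy as np
import pandas as pd


def as_2d_array(data):
    """Return data as a 2-D NumPy array, whether it arrives as a DataFrame
    or as an ndarray (hypothetical helper, not part of Xanthos)."""
    arr = np.asarray(data)
    if arr.ndim != 2:
        raise ValueError(f"expected 2-D data, got {arr.ndim} dimension(s)")
    return arr


# Stand-in for the model output: 3 land cells x 2 time steps.
frame = pd.DataFrame(np.arange(6.0).reshape(3, 2))

# Stand-in for the NetCDF variable being filled.
griddata = np.empty((3, 2))

# Slicing the converted array works where frame[:, :] would fail.
griddata[:, :] = as_2d_array(frame)[:, :].copy()
```

Using data.iloc[:, :] or data.to_numpy() inside save_netcdf would be equivalent options when the input is known to be a DataFrame.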
The hydropower module gives:
DeprecationWarning: .ix is deprecated.
This is an old style of pandas indexing and should be updated.
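For reference, .ix mixed label-based and position-based lookups; the replacements are .loc for labels and .iloc for positions. A tiny illustration with a made-up frame (the column and index names are not from the hydropower module):

```python
import pandas as pd

frame = pd.DataFrame({"q": [1.0, 2.0, 3.0]}, index=["a", "b", "c"])

# Old style (deprecated and later removed): frame.ix["b", "q"]
by_label = frame.loc["b", "q"]    # label-based lookup
by_position = frame.iloc[1, 0]    # position-based lookup
```

Which of the two to use at each call site depends on whether the original .ix call was passing labels or integer positions.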
The example model data covers a period of 31 years, but I want to run the model using only ten years of data. However, I noticed that some values in the output directory are not on the same scale as those from the 31-year run, for example in GCAMRegion_runoff_km3peryear_pm_abcd_mrtm_watch.csv. How should the configuration file be set? When I run it with the first ten years of the example data, I encounter the same issue, but running it with the full 31 years of data gives correct results.
I only modified the input files in the 'pet' and 'climate' folders, which are in the .npy format. I haven't made any changes to other input files. The array dimensions are 67420*120, representing ten years of monthly data.
Do I need to modify anything else? Thank you!
Below is the content of my configuration information file.
[Project]
ProjectName = pm_abcd_mrtm_watch_1971_1980
RootDir = d:\work\xanthos-main1\example
InputFolder = input
OutputFolder = output
RefDir = reference
pet_dir = pet
RoutingDir = routing
RunoffDir = runoff
DiagDir = diagnostics
AccWatDir = accessible_water
HydActDir = hydropower_actual
HistFlag = True
n_basins = 235
StartYear = 1971
EndYear = 1980
output_vars = pet, aet, q, soilmoisture, avgchflow
OutputFormat = 1
OutputUnit = 1
OutputInYear = 1
AggregateRunoffBasin = 1
AggregateRunoffCountry = 1
AggregateRunoffGCAMRegion = 1
PerformDiagnostics = 1
CreateTimeSeriesPlot = 0
CalculateDroughtStats = 1
CalculateAccessibleWater = 1
CalculateHydropowerPotential = 1
CalculateHydropowerActual = 1
Calibrate = 0
[PET]
pet_module = pm
[[penman-monteith]]
pet_dir = penman_monteith
pm_tas = tas_watch_monthly_degc_1971_1980.npy
pm_tmin = tasmin_watch_monthly_degc_1971_1980.npy
pm_rhs = rhs_watch_monthly_percent_1971_1980.npy
pm_rlds = rlds_watch_monthly_wperm2_1971_1980.npy
pm_rsds = rsds_watch_monthly_wperm2_1971_1980.npy
pm_wind = wind_watch_monthly_mpers_1971_1980.npy
pm_lct = lucc1901_2010_lump.npy
pm_nlcs = 8
pm_water_idx = 0
pm_snow_idx = 6
pm_lc_years = 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2005, 2010
[Runoff]
runoff_module = abcd
[[abcd]]
runoff_dir = abcd
calib_file = pars_watch_1971_1990_decadal_lc.npy
runoff_spinup = 120
jobs = -1
TempMinFile = example/input/climate/pr_gpcc_watch_monthly_mmpermth_1971_1980.npy
PrecipitationFile =example/input/climate/pr_gpcc_watch_monthly_mmpermth_1971_1980.npy
[Routing]
routing_module = mrtm
[[mrtm]]
routing_dir = mrtm
routing_spinup = 120
channel_velocity = velocity_half_degree.npy
flow_distance = DRT_half_FDISTANCE_globe.txt
flow_direction = DRT_half_FDR_globe_bystr50.txt
[Diagnostics]
VICDataFile = vic_watch_hist_nosoc_co2_qtot_global_annual_1971_2000.nc
WBMDataFile = wbm_qestimates.csv
WBMCDataFile = wbmc_qestimates.csv
UNHDataFile = UNH_GRDC_average_annual_1986_1995.nc
Scale = 0
[Drought]
threshold_start_year = 1971
threshold_end_year = 1980
threshold_nper = 12
[AccessibleWater]
ResCapacityFile = total_reservoir_storage_capacity_BM3.csv
BfiFile = bfi_per_basin.csv
HistEndYear = 1980
GCAM_StartYear = 1971
GCAM_EndYear = 1980
GCAM_YearStep = 1
MovingMeanWindow = 9
Env_FlowPercent = 0.1
[HydropowerPotential]
hpot_start_date = "1/1971"
q_ex = 0.7
ef = 0.8
[HydropowerActual]
hact_start_date = "1/1971"
[TimeSeriesPlot]
Scale = 1
MapID = 999
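One quick sanity check before a shortened run is that every monthly input array has exactly (EndYear - StartYear + 1) * 12 columns. The sketch below is only that check, with hypothetical function names and a small stand-in array; it does not diagnose the scale difference itself:

```python
import numpy as np


def expected_months(start_year, end_year):
    """Number of monthly time steps for an inclusive year range."""
    return (end_year - start_year + 1) * 12


def check_monthly_shape(array, start_year, end_year):
    """Raise if a [land_cell, month] array does not match the year range."""
    n_months = expected_months(start_year, end_year)
    if array.ndim != 2 or array.shape[1] != n_months:
        raise ValueError(
            f"expected shape (n_cells, {n_months}), got {array.shape}"
        )


# Stand-in for an input loaded with np.load(...); the real files in the
# post have 67420 land cells, reduced here to keep the demo light.
climate = np.zeros((5, 120))
check_monthly_shape(climate, 1971, 1980)  # passes: 10 years * 12 months
```

The same check can be pointed at each .npy file named in the [PET] and [Runoff] sections, and the year range compared against the other start/end year settings in the file.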
I already have runoff data from a monthly rainfall-runoff model, and I want to use the routing module to help my research. I would be grateful if anyone could give me some advice. Many thanks again.
In addition, I would like to know whether I can use other routing datasets for flow direction and velocity besides those used by MRTM, which is based on the DRT routing scheme.
This model works, but I don't know how to display the result for a single month.
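If the output is available as a monthly [land_cell, month] array, a single month is just one column. A minimal sketch, assuming the first column corresponds to January of the start year; the function name and the demo array are made up:

```python
import numpy as np


def month_column(start_year, year, month):
    """Column index of a given year/month in a [land_cell, month] array
    whose first column is January of start_year (assumption)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    if year < start_year:
        raise ValueError("year is before the start of the record")
    return (year - start_year) * 12 + (month - 1)


# Stand-in for a monthly output: 2 land cells, 1971-1972 (24 months).
runoff = np.arange(2 * 24, dtype=float).reshape(2, 24)

# All land cells for March 1972.
march_1972 = runoff[:, month_column(1971, 1972, 3)]
```

Note that this only applies to monthly output; a run configured to write annual values has one column per year instead.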
Reported by Guta W Abeshu in email to Mohamad on Saturday, August 15, 2020:
"I found a numerical issue that results in NAN in estimated runoff magnitudes in the recent version of Xanthos. This issue is only noticeable beyond 1985 in many basins.
Below here is the equation used for ET opportunity computation (Line 208, on the GitHub file abcd.py). The NAN values were detected when the term under the square root is < 0. In terms of magnitude, the values under square root during these times were found to be in the order of 1e-10. So, merely rounding the decimal places reduces them to zero and solves the problem. It is a small issue, but the NAN values generated here propagates to flow routing, causing additional problems. "
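The actual expression from abcd.py is not reproduced in the report, so the following only sketches the general guard being described: treat tiny negative values caused by floating-point round-off as zero before taking the square root. The function name and the tolerance are assumptions:

```python
import numpy as np


def safe_sqrt(values, tol=1e-8):
    """Square root that maps round-off negatives in (-tol, 0) to zero.

    Values more negative than -tol are left alone, so a genuine error
    still shows up as NaN instead of being silently hidden.
    """
    values = np.asarray(values, dtype=float)
    cleaned = np.where((values < 0) & (values > -tol), 0.0, values)
    return np.sqrt(cleaned)


# A term of order -1e-10, as in the report, no longer produces NaN.
roots = safe_sqrt([-1e-10, 0.0, 4.0])
```

Clamping with a tolerance is a little more conservative than rounding, since it leaves all non-negative values untouched.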
git lfs install
git clone https://github.com/JGCRI/xanthos.git
When attempting the above from a Windows shell, the download hangs.
Solution from here is to use git lfs clone https://github.com/JGCRI/xanthos.git, which works. This should maybe be added to point 2 of the getting started instructions.
There are several bare except: statements. Catch a specific exception instead. (See https://www.python.org/dev/peps/pep-0008/#programming-recommendations)
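As an illustration of the pattern (the function below is hypothetical and not taken from the Xanthos source), a bare except: would also swallow KeyboardInterrupt and genuine bugs, while naming the exceptions keeps the fallback narrow:

```python
def read_spinup_months(raw_value, default=0):
    """Parse a spin-up length from a config value, falling back to a default.

    Only the errors that int() raises for bad input are caught; anything
    else propagates so real problems stay visible.
    """
    try:
        return int(raw_value)
    except (TypeError, ValueError):
        return default


months = read_spinup_months("120")
fallback = read_spinup_months("not-a-number")
```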
Add an option in the Xanthos configuration file and code that allows for only certain outputs to be written to disk
It is unclear what the intent is of the HistFlag configuration variable. The wiki says
If True, channel storage and soil moisture files are saved; if False, channel storage and soil moisture files are loaded when using the GWAM runoff module.
This isn't exactly correct, as the MRTM routing module also uses this flag to check whether to load channel storage values.
The example configuration file has
# HistFlag = True, historic mode ; = False, future mode
HistFlag = True
but it isn't clear what is meant by historic mode or future mode.
This is also the only place where the truthy value is "True"/"False" (although "true" is allowed in some cases, and "t", "y", "yes", and "1" are even allowed in one case). The preferred alternative would be 0/1 like the other flag options.
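If the flag were moved to 0/1, a single tolerant parser could keep older configuration files working during the transition. This is only a sketch; the helper name and the exact set of accepted spellings are assumptions:

```python
TRUE_VALUES = {"1", "true", "t", "y", "yes"}
FALSE_VALUES = {"0", "false", "f", "n", "no"}


def parse_flag(raw):
    """Turn a config value such as 1, "0", "True" or "false" into a bool."""
    text = str(raw).strip().lower()
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    raise ValueError(f"cannot interpret {raw!r} as a 0/1 flag")


hist_flag = parse_flag("True")   # legacy spelling
new_style = parse_flag(1)        # preferred 0/1 spelling
```

Using one helper for every flag would also remove the inconsistency in which spellings each option accepts.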
Hello,
A postdoc user of Xanthos pointed out this issue to me. They were trying to reproduce the example run but with user-provided PET and RUNOFF files, but a few things conspire to prevent this:
The documentation here describes a runoff_file setting under the [Runoff] configuration tag. As far as I can see in the code, this setting is totally ignored. Instead, one needs to use the [Routing] [[mrtm]] alt_runoff setting.
Even so, if pet_module = none and runoff_module = none, neither the pet_file nor the alt_runoff file gets passed on to the routing module (or the result files).
I will submit a Pull Request that can fix this issue for this particular use case, but I'm not as familiar with the broader context of the model, so you may wish to address it in a different way!
Currently the output csvs (e.g. Basin_runoff_235_XXXX.csv) have a final column header that specifies units (e.g., Unit (km^3/year)). This header doesn't actually refer to a column of data in the table, making the csv difficult to work with. I think the unit needs to be either:
(1) included as a full column,
(2) included as a comment at the top of the file (best option in my opinion), or
(3) included in the file name.
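Option (2) could look roughly like this: write the unit as a leading comment line, which pandas can skip on read via the comment argument of read_csv. The column names, values, and helper name are invented for the example:

```python
import io

import pandas as pd


def write_csv_with_unit(frame, handle, unit):
    """Write a one-line unit comment followed by a regular CSV table."""
    handle.write(f"# unit: {unit}\n")
    frame.to_csv(handle, index=False)


table = pd.DataFrame({"id": [1, 2], "1971": [200.5, 15.25]})

# An in-memory buffer stands in for the output file.
buffer = io.StringIO()
write_csv_with_unit(table, buffer, "km^3/year")

# Reading back: lines starting with "#" are ignored by the parser.
buffer.seek(0)
round_trip = pd.read_csv(buffer, comment="#")
```

One caveat is that not every CSV consumer understands comment lines, which is the main argument for options (1) or (3).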
Running Xanthos from the command line with the configuration file as the first (and only) argument:
python xanthos/model.py example/pm_abcd_mrtm.ini
This errors because the flag -config_file was not specified. We should assume any extra argument is the -config_file parameter.
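With argparse this could be done by accepting an optional positional argument alongside the existing flag. A sketch under the assumption that -config_file is the only relevant option; the function and argument names are illustrative:

```python
import argparse


def parse_config_path(argv):
    """Accept the config file either positionally or via -config_file."""
    parser = argparse.ArgumentParser(description="Run a model configuration")
    parser.add_argument("positional_config", nargs="?", default=None,
                        help="configuration file given without a flag")
    parser.add_argument("-config_file", dest="config_file", default=None,
                        help="configuration file given with the flag")
    args = parser.parse_args(argv)
    # The explicit flag wins if both forms are supplied.
    config = args.config_file or args.positional_config
    if config is None:
        parser.error("a configuration file is required")
    return config


from_position = parse_config_path(["example/pm_abcd_mrtm.ini"])
from_flag = parse_config_path(["-config_file", "example/pm_abcd_mrtm.ini"])
```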
Dear all, I've been working with xanthos for the past month and I'm having trouble understanding the conversion from mm/year to km3/year in the output files. Running the example file (https://zenodo.org/record/2578287) with the km3peryear and mmperyear output options gives different results. For example, for the Arctic Ocean Islands basin in the year 1971, the run with km3 gives a runoff of 200.778857, and the run with mm gives 224395.703522.
If I understand correctly, 1 mm of water corresponds to 1 kg of water (in volume terms: 1 mm x 1 m x 1 m), hence to convert it to km3 I would have to divide the mm quantity by 1e12. As this does not give the same number, I divided the km3 column by the mm column in my dataset to see what the ratio between them is.
This gives a ratio between 0.001 and 0.003; it varies between basins and is correlated with the basin ID. Hence, I thought that the conversion takes the area of the basin into account, but then mmperyear should correspond not to the total but to the mean or something similar. However, the ratio also varies slightly between years (for example, for Arctic Ocean Islands between 1971 and 2001 it varies between 0.001006 and 0.001143).
I would really appreciate it if someone could clarify this.
Thanks,
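For what it's worth, a depth in mm only becomes a volume once it is multiplied by an area: 1 mm over 1 km2 is 1e-3 m x 1e6 m2 = 1e3 m3 = 1e-6 km3. The sketch below encodes just that relation with invented numbers; how Xanthos aggregates mm values across grid cells into the basin tables is not stated in this thread, so it should not be read as the model's method:

```python
def depth_mm_to_volume_km3(depth_mm, area_km2):
    """Convert a water depth over an area into a volume.

    1 mm over 1 km2 equals 1e-6 km3, so volume = depth * area / 1e6.
    """
    return depth_mm * area_km2 / 1e6


# Invented example: 500 mm/year over a 10,000 km2 area is 5 km3/year.
volume = depth_mm_to_volume_km3(500.0, 10_000.0)
```

If the mm output were a sum of per-cell depths over cells of different sizes, the km3-to-mm ratio would vary by basin, and also by year as the spatial pattern of runoff shifts, but that is a guess rather than something confirmed here.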
Hi! I've been using xanthos this past month and the Diagnostics module prints the following FutureWarning:
INFO: ---Start Diagnostics:
/home/rcalvo/miniconda3/envs/xanthos/lib/python3.7/site-packages/xanthos/diagnostics/diagnostics.py:130: FutureWarning: Slicing a positional slice with .loc is not supported, and will raise TypeError in a future version. Use .loc with labels or .iloc with positions instead.
agg_df.loc[-1, 1:] = agg_df.sum(numeric_only=True)
Just a heads up.
Thanks.
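The warning comes from passing the positional slice 1: to .loc. One way to build the same totals row without it, shown on a made-up frame (this swaps the in-place assignment for a concat and is not the project's actual fix):

```python
import pandas as pd

# Stand-in for agg_df: a name column followed by numeric year columns.
agg_df = pd.DataFrame(
    {"name": ["a", "b"], "1971": [1.0, 2.0], "1972": [3.0, 4.0]}
)

# Sum every column after the first, selected by position with .iloc.
totals = agg_df.iloc[:, 1:].sum()

# Turn the totals into a one-row frame labelled -1, as in the original.
totals_row = totals.to_frame().T
totals_row.index = [-1]

# Append the row; the name column is left empty (NaN) for the totals.
with_totals = pd.concat([agg_df, totals_row])
```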
There is a basin in the BasinNames235.txt file called "South America Colorad". I think this should be "Colorado".
Hi,
When I select the csv or matlab format for the outputs, everything works, but there are problems with the NetCDF option. I get this message on the last line of processing the example data:
TypeError: '(slice(None, None, None), slice(None, None, None))' is an invalid key
How can I solve this issue?
Thanks
Miguel
The current xanthos can only output annual hydropower by GCAM region. Would it be possible to add the functionality to enable monthly hydropower generation by GCAM basin? (or by dam?) Thanks!
Xanthos' .csv outputs can be quite bulky, and aren't suited well to comparisons of many runs of Xanthos.
Let's try adding an output option for writing to a parquet file, which has good compression for column-based data and can be added to easily.
We should have unit tests. They can be set up without too much difficulty using unittest, Python's testing framework.
In particular we should be able to test:
The configurations.py file has a lot of repetition, and adding a new configuration requires duplicating a lot of this code. We can think of a better system for this.