Comments (5)
@zoltanmaric I am impressed how fast you are catching up with the atlite code! Definitely makes sense!
from atlite.
Interesting, but I would not bother with the value differences between time resolutions, since they are related to ERA5 internals only.
The culprit is this attempt to infer the frequency of the time index:
atlite/atlite/datasets/era5.py
Lines 157 to 174 in a0bd4b0
An ERA5 CDS request spanning June 30th and July 1st looks like this:
{
    'product': 'reanalysis-era5-single-levels',
    'year': '2022',
    'month': [6, 7],
    'day': [30, 1],
    'time': ['00:00', '01:00', '02:00', '03:00', '04:00', '05:00', '06:00', '07:00', '08:00', '09:00', '10:00', '11:00', '12:00', '13:00', '14:00', '15:00', '16:00', '17:00', '18:00', '19:00', '20:00', '21:00', '22:00', '23:00']
}
which returns results covering June 1st, June 30th, July 1st, and July 30th:
Time index contents
From get_data_influx
print(ds.time)
Output:
<xarray.DataArray 'time' (time: 96)>
array(['2022-06-01T00:00:00.000000000', '2022-06-01T01:00:00.000000000',
'2022-06-01T02:00:00.000000000', '2022-06-01T03:00:00.000000000',
'2022-06-01T04:00:00.000000000', '2022-06-01T05:00:00.000000000',
'2022-06-01T06:00:00.000000000', '2022-06-01T07:00:00.000000000',
'2022-06-01T08:00:00.000000000', '2022-06-01T09:00:00.000000000',
'2022-06-01T10:00:00.000000000', '2022-06-01T11:00:00.000000000',
'2022-06-01T12:00:00.000000000', '2022-06-01T13:00:00.000000000',
'2022-06-01T14:00:00.000000000', '2022-06-01T15:00:00.000000000',
'2022-06-01T16:00:00.000000000', '2022-06-01T17:00:00.000000000',
'2022-06-01T18:00:00.000000000', '2022-06-01T19:00:00.000000000',
'2022-06-01T20:00:00.000000000', '2022-06-01T21:00:00.000000000',
'2022-06-01T22:00:00.000000000', '2022-06-01T23:00:00.000000000',
'2022-06-30T00:00:00.000000000', '2022-06-30T01:00:00.000000000',
'2022-06-30T02:00:00.000000000', '2022-06-30T03:00:00.000000000',
'2022-06-30T04:00:00.000000000', '2022-06-30T05:00:00.000000000',
'2022-06-30T06:00:00.000000000', '2022-06-30T07:00:00.000000000',
'2022-06-30T08:00:00.000000000', '2022-06-30T09:00:00.000000000',
'2022-06-30T10:00:00.000000000', '2022-06-30T11:00:00.000000000',
'2022-06-30T12:00:00.000000000', '2022-06-30T13:00:00.000000000',
'2022-06-30T14:00:00.000000000', '2022-06-30T15:00:00.000000000',
'2022-06-30T16:00:00.000000000', '2022-06-30T17:00:00.000000000',
'2022-06-30T18:00:00.000000000', '2022-06-30T19:00:00.000000000',
'2022-06-30T20:00:00.000000000', '2022-06-30T21:00:00.000000000',
'2022-06-30T22:00:00.000000000', '2022-06-30T23:00:00.000000000',
'2022-07-01T00:00:00.000000000', '2022-07-01T01:00:00.000000000',
'2022-07-01T02:00:00.000000000', '2022-07-01T03:00:00.000000000',
'2022-07-01T04:00:00.000000000', '2022-07-01T05:00:00.000000000',
'2022-07-01T06:00:00.000000000', '2022-07-01T07:00:00.000000000',
'2022-07-01T08:00:00.000000000', '2022-07-01T09:00:00.000000000',
'2022-07-01T10:00:00.000000000', '2022-07-01T11:00:00.000000000',
'2022-07-01T12:00:00.000000000', '2022-07-01T13:00:00.000000000',
'2022-07-01T14:00:00.000000000', '2022-07-01T15:00:00.000000000',
'2022-07-01T16:00:00.000000000', '2022-07-01T17:00:00.000000000',
'2022-07-01T18:00:00.000000000', '2022-07-01T19:00:00.000000000',
'2022-07-01T20:00:00.000000000', '2022-07-01T21:00:00.000000000',
'2022-07-01T22:00:00.000000000', '2022-07-01T23:00:00.000000000',
'2022-07-30T00:00:00.000000000', '2022-07-30T01:00:00.000000000',
'2022-07-30T02:00:00.000000000', '2022-07-30T03:00:00.000000000',
'2022-07-30T04:00:00.000000000', '2022-07-30T05:00:00.000000000',
'2022-07-30T06:00:00.000000000', '2022-07-30T07:00:00.000000000',
'2022-07-30T08:00:00.000000000', '2022-07-30T09:00:00.000000000',
'2022-07-30T10:00:00.000000000', '2022-07-30T11:00:00.000000000',
'2022-07-30T12:00:00.000000000', '2022-07-30T13:00:00.000000000',
'2022-07-30T14:00:00.000000000', '2022-07-30T15:00:00.000000000',
'2022-07-30T16:00:00.000000000', '2022-07-30T17:00:00.000000000',
'2022-07-30T18:00:00.000000000', '2022-07-30T19:00:00.000000000',
'2022-07-30T20:00:00.000000000', '2022-07-30T21:00:00.000000000',
'2022-07-30T22:00:00.000000000', '2022-07-30T23:00:00.000000000'],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2022-06-01 ... 2022-07-30T23:00:00
Attributes:
long_name: time
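The four dates in this output are consistent with the request being expanded as the cross product of the 'month' and 'day' lists, rather than as paired (month, day) entries. A minimal sketch of that expansion, assuming this cross-product behavior (the variable names are illustrative and not part of atlite or cdsapi):

```python
from datetime import date
from itertools import product

# Values taken from the request shown above.
year = 2022
months = [6, 7]
days = [30, 1]

# Assumption: every requested month is combined with every requested day.
requested_dates = sorted(date(year, m, d) for m, d in product(months, days))

for d in requested_dates:
    print(d.isoformat())
```

This lists June 1st, June 30th, July 1st and July 30th, the same four days that appear in the time index.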
Because the time index is not continuous, pd.infer_freq returns None, resulting in a time shift of minus 12 hours.
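The frequency inference can be reproduced in isolation with plain pandas. The sketch below rebuilds a time index with the same 96 timestamps as in the output above and compares it with a contiguous hourly index. The variable names are illustrative only, and the snippet does not call any atlite code:

```python
import pandas as pd

# Rebuild the 96-entry time index from the output above: 24 hourly steps
# on each of four non-adjacent days.
days = ["2022-06-01", "2022-06-30", "2022-07-01", "2022-07-30"]
gappy_index = pd.DatetimeIndex(
    [ts for day in days for ts in pd.date_range(day, periods=24, freq="h")]
)

# A contiguous hourly index covering only the two days that were wanted.
contiguous_index = pd.date_range("2022-06-30", periods=48, freq="h")

# None, because the spacing between timestamps is irregular.
print(pd.infer_freq(gappy_index))

# An hourly alias ("h" or "H", depending on the pandas version).
print(pd.infer_freq(contiguous_index))
```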
I think the fix should be as simple as just always setting the time shift to 30 minutes (proposed diff).
The get_data function of era5.py already has reanalysis-era5-single-levels hard-coded as the product:
atlite/atlite/datasets/era5.py
Lines 352 to 359 in a0bd4b0
To my understanding, this product's resolution is always hourly (see docs), so there's no reason to attempt to infer a different frequency.
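A minimal sketch of such a constant shift, using plain pandas rather than the actual atlite code. The names `FIXED_TIME_SHIFT` and `shift_by_fixed_offset` are hypothetical, and the negative sign is an assumption that follows the "minus" shift mentioned earlier in this thread:

```python
import pandas as pd

# Assumed constant offset: half of the hourly resolution of the product.
FIXED_TIME_SHIFT = pd.Timedelta(minutes=-30)


def shift_by_fixed_offset(index: pd.DatetimeIndex) -> pd.DatetimeIndex:
    """Shift every timestamp by the fixed offset, without inspecting the spacing."""
    return index + FIXED_TIME_SHIFT


# Works on a non-contiguous index too, since no frequency has to be inferred.
index = pd.DatetimeIndex(["2022-06-30 23:00", "2022-07-01 00:00", "2022-07-30 00:00"])
shifted = shift_by_fixed_offset(index)
print(shifted)
```

Because the offset does not depend on the spacing of the timestamps, gaps in the time index no longer affect the result.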
What do you think @euronion @FabianHofmann ?
Comparing ERA5 Values Requested at Different Time Samplings
Here's a comparison of values received from ERA5 at 1h, 2h, 3h, 4h, and 6h sampling:
Code to Generate the Above Table
import functools

import pandas as pd
from dask.utils import SerializableLock

import atlite.datasets.era5 as era5

# Create lists of hours like [00:00, 01:00, 02:00, ...], sampled
# every hour, every 2 hours, etc.
time_sampling = {}
for rate in [1, 2, 3, 4, 6]:
    time_sampling[rate] = [f"{hour:02}:00" for hour in range(0, 24, rate)]

retrieval_params = {
    'product': 'reanalysis-era5-single-levels',
    'area': [57.0, -0.5, 56.0, 0.5],
    'chunks': {'time': 100},
    'grid': [0.25, 0.25],
    'tmpdir': '/tmp',
    'lock': SerializableLock(),
    'year': '2013',
    'month': [1],
    'day': [1]
}

# One parameter set per sampling rate, differing only in the requested hours
param_sets = {rate: {**retrieval_params, 'time': time} for rate, time in time_sampling.items()}


def retrieve_data_for_single_raster(params: dict) -> pd.DataFrame:
    variable = [
        "surface_net_solar_radiation",
        "surface_solar_radiation_downwards",
        "toa_incident_solar_radiation",
        "total_sky_direct_solar_radiation_at_surface",
    ]
    ds = era5.retrieve_data(variable=variable, **params)
    # Select a single grid cell and keep only the radiation columns
    return ds.sel(latitude=56, longitude=0).load().to_dataframe()[["ssr", "ssrd", "tisr", "fdir"]]


# Retrieve ERA5 data for each different time sampling
raw_ds = {rate: retrieve_data_for_single_raster(params) for rate, params in param_sets.items()}


def join_dfs(left_sampling_and_df, right_sampling_and_df):
    left_sampling, left_df = left_sampling_and_df
    right_sampling, right_df = right_sampling_and_df
    # Columns from the right-hand frame get a suffix such as "_2h"
    suffix = f"_{right_sampling}h"
    return left_sampling, left_df.join(right_df, on='time', how='left', rsuffix=suffix)


# Merge all data into a single dataframe to show differences in values
# for each hour next to each other
_, merged = functools.reduce(join_dfs, raw_ds.items())
samplings_compared = (
    merged.sort_index(axis='columns')
    .query('time.dt.hour > 7 and time.dt.hour < 18')
    .astype(float)
    .round(decimals=2)
)

# Remove date part
samplings_compared.index = samplings_compared.index.time
samplings_compared.to_csv('/tmp/samplings_compared.csv')
While this does show that the time shift is not proportional to the sampling, there's an additional weird twist to it: the values for 2h, 3h, 4h, and 6h sampling seem to be equal, but the values for 1h sampling are slightly different (by less than 1% at noon).
It would be interesting to find out why that is, but as far as this issue is concerned, I think it's still more appropriate to always shift the time by 30 minutes than by half of the sampling interval.
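To make the difference between the two rules concrete, the sketch below tabulates the shift each one would apply at the sampling intervals compared above. The column names are illustrative, and the negative sign is an assumption that follows the "minus" shift mentioned earlier in this thread:

```python
import pandas as pd

sampling_hours = [1, 2, 3, 4, 6]

comparison = pd.DataFrame(
    {
        # Rule 1: shift by half of the sampling interval.
        "half_interval_shift": [pd.Timedelta(hours=-h) / 2 for h in sampling_hours],
        # Rule 2: always shift by 30 minutes, whatever the sampling.
        "fixed_shift": [pd.Timedelta(minutes=-30)] * len(sampling_hours),
    },
    index=pd.Index(sampling_hours, name="sampling_h"),
)
print(comparison)
```

The two rules agree only at 1h sampling. At 6h sampling the half-interval rule would shift by three hours, while the fixed rule still shifts by 30 minutes.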