openenergyplatform / oedatamodel

A common open energy data model (oedatamodel) and datapackage format for energy and scenario data

License: Creative Commons Zero v1.0 Universal

Python 100.00%

data-model database datapackage erm json oed

oedatamodel's Issues

Normalize data table of normalization format

I suggest normalizing the data table further into the additional tables region, energy_vector and technology,
with the following relations and columns:

region:
  m:n relation to the data table via a data_region table
  columns: id, name
energy_vector:
  two 1:n relations to the data table, via columns input_energy_vector_id and output_energy_vector_id
  columns: id, name
technology:
  1:n relation to the data table, via column technology_id
  columns: id, technology, technology_type, parameter_name

This would lead to an improved DB structure and less data per table.
The user would not "feel" the new structure, as the concrete format can still be used as the interface.
Additionally, checking for already existing data entries should be faster, which would speed up uploading new data via the new relation format (m:n).
On the other hand, the normalization format becomes a bit more unwieldy.
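
The proposed layout could be sketched as follows, using an in-memory SQLite database from the Python standard library. This is only an illustration under assumptions: the table and column names follow the proposal above, but the column types, the `value` column, and the helper names `build_demo` and `regions_of` are hypothetical and not part of the oedatamodel.

```python
import sqlite3

# Hypothetical schema following the proposal; types and "value" are assumptions.
SCHEMA = """
CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE energy_vector (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE technology (
    id INTEGER PRIMARY KEY,
    technology TEXT NOT NULL,
    technology_type TEXT,
    parameter_name TEXT
);
CREATE TABLE data (
    id INTEGER PRIMARY KEY,
    technology_id INTEGER REFERENCES technology(id),
    input_energy_vector_id INTEGER REFERENCES energy_vector(id),
    output_energy_vector_id INTEGER REFERENCES energy_vector(id),
    value REAL
);
-- Junction table carrying the m:n relation between data and region.
CREATE TABLE data_region (
    data_id INTEGER REFERENCES data(id),
    region_id INTEGER REFERENCES region(id),
    PRIMARY KEY (data_id, region_id)
);
"""


def build_demo():
    """Create the schema and insert one data row linked to two regions."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany("INSERT INTO region (id, name) VALUES (?, ?)",
                    [(1, "DE"), (2, "FR")])
    con.executemany("INSERT INTO energy_vector (id, name) VALUES (?, ?)",
                    [(1, "gas"), (2, "electricity")])
    con.execute(
        "INSERT INTO technology (id, technology, technology_type, parameter_name) "
        "VALUES (1, 'gas turbine', 'generator', 'efficiency')"
    )
    con.execute(
        "INSERT INTO data (id, technology_id, input_energy_vector_id, "
        "output_energy_vector_id, value) VALUES (1, 1, 1, 2, 0.4)"
    )
    con.executemany("INSERT INTO data_region (data_id, region_id) VALUES (?, ?)",
                    [(1, 1), (1, 2)])
    return con


def regions_of(con, data_id):
    """Return the region names attached to one data row, sorted by name."""
    rows = con.execute(
        "SELECT r.name FROM region r "
        "JOIN data_region dr ON dr.region_id = r.id "
        "WHERE dr.data_id = ? ORDER BY r.name",
        (data_id,),
    )
    return [row[0] for row in rows]
```

Each region name is stored once and referenced by id, which is where the smaller tables and the faster duplicate checks would come from.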

Update OEDatamodel to OEMetadata v1.5.1

Update metadata to v1.5.1
Release new OEDatamodel version v1.2.0
Update documentation

[Add CI] Validate datapackage

see OpenEnergyPlatform/oemetadata#49

Test and document as frictionless datapackage

Using the Python package:

https://github.com/frictionlessdata/datapackage-py#package

Also see the pyam documentation:

https://pyam-iamc.readthedocs.io/en/stable/api/io.html#the-frictionless-data-package

Missing documentation and examples on column and datamodel usage

Add documentation to all column names describing the use of each column
Provide examples like an example dataset for each datamodel

Working Group on Data Management published data structures

https://www.h2020-bridge.eu/working-groups/data-management/

Develop a general scenario data format

There is a data format from the MODEX project FlexMex which is used in open_MODEX and SzenarienDB.

Discuss and develop a common datapackage format for scenario data.

Existing ideas:

Move and remove old versions from other repos

https://github.com/OpenEnergyPlatform/tutorial/tree/develop/data/datapackage/oep-scenario-data-datapackage

Collect requirements to improve OEDatamodel

In this issue, requirements and ideas for improving the OEDatamodel are collected and discussed. The improvements should focus on the SEDOS project but are not limited to this context. We have already collected a few requirements during project-internal discussions:

enable versioning
enable bandwidths of values in scalar and timeseries (discussed in #52)
annotate reference dataset via ontology (OEO)
enable N:N relation between scenario table and scalar and timeseries tables (discussed in #46)

Column "year" missing in scalar?

No column "id" in tables of oedatamodel

Just wanted to post data into our oedatamodel, but the upload fails with the following response:
"reason": "column \"id\" does not exist"

https://oep-data-interface.readthedocs.io/en/latest/api/how_to.html#create-table states that tables have to have a primary key called "id". Thus, we have to change "scenario_id" in table "oed_scenario" to "id" (and likewise in all other tables). Unfortunately, there is no error message when creating a table without an "id" column. Maybe there should be one?!
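
Until such an error message exists, a check like the following could be run locally before creating tables. It is only a sketch: the function name `missing_id_tables` and the plain dict of column lists are assumptions for illustration, and it does not call the OEP API.

```python
def missing_id_tables(tables):
    """Return the names of tables whose column list has no column named 'id'.

    `tables` maps a table name to its list of column names.
    """
    return [name for name, columns in tables.items() if "id" not in columns]


# Column lists are illustrative; only "oed_scenario" and "scenario_id"
# come from the issue text.
tables = {
    "oed_scenario": ["scenario_id", "scenario"],
    "oed_data": ["id", "scenario_id", "value"],
}
problems = missing_id_tables(tables)
```

Here `problems` lists the tables that would need their primary key renamed before upload.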

Move input/output energy vector to Scenario model and replace with value?

Wrong foreign keys and missing schemas in datapackage

I tried to create an ORM from the metadata file (via https://github.com/OpenEnergyPlatform/oem2orm), but this failed due to several minor things:

schema not given in FK
wrong PK/FK name in data table (scenario/data IDs switched)
invalid types (for oem2orm) "hstore", "decimal"
array[float] has to be written as "float array" in oem2orm (to be discussed, as this does not look nice)

Update usage documentation and examples

As we see some usability issues while testing the oedatamodel_api within the open modex project, we would like to collect the feedback here and update the documentation and examples accordingly.

Support reuse of data in multiple scenarios

At the moment, data is always assigned to one scenario (1:n). To make it possible to reuse data in different scenarios, an m:n relationship between the data and scenario tables should be built into the oedatamodel.

A solution for the normalization variation of the oedatamodel is to insert a new table between the data and scenario tables that has 2 columns (scenario_id and data_id).
For the concrete variation of the oedatamodel it is more difficult to find a solution that also keeps the usability. Here the data tables are split into 2 tables to make it easier for the user to fill them out, so 2 new tables would have to be inserted between the data tables and the scenario table.
Another solution would be to restrict the concrete variation explicitly to data belonging to a single scenario.

It would be useful to find a solution for the concrete variation as well, so let's discuss it here.
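
For the normalization variation, the two-column link table described above could look like this minimal sketch (SQLite in memory). The table name `scenario_data`, the column types, the `value` column and the helper names are assumptions made for illustration; only `scenario_id` and `data_id` come from the proposal.

```python
import sqlite3


def build_link_demo():
    """Create scenario, data and a link table; reuse one data row twice."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE scenario (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE data (id INTEGER PRIMARY KEY, value REAL);
        -- Link table with the two columns proposed in the issue.
        CREATE TABLE scenario_data (
            scenario_id INTEGER REFERENCES scenario(id),
            data_id INTEGER REFERENCES data(id),
            PRIMARY KEY (scenario_id, data_id)
        );
    """)
    con.executemany("INSERT INTO scenario (id, name) VALUES (?, ?)",
                    [(1, "base"), (2, "high_demand")])
    con.executemany("INSERT INTO data (id, value) VALUES (?, ?)",
                    [(10, 0.4), (11, 120.0)])
    # Data row 10 is shared by both scenarios; row 11 belongs to scenario 2 only.
    con.executemany("INSERT INTO scenario_data (scenario_id, data_id) VALUES (?, ?)",
                    [(1, 10), (2, 10), (2, 11)])
    return con


def data_ids_for(con, scenario_id):
    """Return the ids of all data rows linked to one scenario, ascending."""
    rows = con.execute(
        "SELECT data_id FROM scenario_data WHERE scenario_id = ? ORDER BY data_id",
        (scenario_id,),
    )
    return [row[0] for row in rows]
```

The composite primary key keeps each (scenario, data) pair unique while letting a data row appear under any number of scenarios.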

Create json schemas describing both data models

Overview

Hello, I have been working on integrating this data model without friction into frictionless. This requires effort in different repositories. Here are the pull requests and issues:

Frictionless:
frictionlessdata/frictionless-py#1229
frictionlessdata/frictionless-py#1228

oemetadata:

OpenEnergyPlatform/oemetadata#91

If these changes take place, then validating the models will work directly with frictionless. I think the next step is being able to validate the structure of the tables themselves. To do this, we would need descriptors in JSON format.

Task

Write data descriptors for the data model. These would comprise the JSON schema representation of the packages:

I can contribute to this next month, but if someone starts on this earlier, that would be great.
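
As a starting point for discussion, a table descriptor in the style of a Frictionless Table Schema (a `fields` list plus a `primaryKey`) might look like the sketch below. The field names and types are hypothetical and not taken from the actual oedatamodel tables, and the column check is a stdlib-only stand-in, not the frictionless validator.

```python
import json

# Hypothetical descriptor; field names and types are assumptions.
SCALAR_SCHEMA = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "scenario_id", "type": "integer"},
        {"name": "region", "type": "string"},
        {"name": "year", "type": "integer"},
        {"name": "value", "type": "number"},
    ],
    "primaryKey": "id",
}


def to_descriptor(schema):
    """Serialize the schema dict to a JSON descriptor string."""
    return json.dumps(schema, indent=2)


def unknown_columns(row, schema):
    """Return the keys of `row` that the schema does not declare, sorted."""
    declared = {field["name"] for field in schema["fields"]}
    return sorted(set(row) - declared)
```

Writing `to_descriptor(SCALAR_SCHEMA)` to a file would give a JSON descriptor that could be versioned next to the datapackage.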

Providing an OEP data-model template

[Top-Level Issue]
First draft of what needs to be done:

Develop a data model template for scenario data (example)
Establish a general naming convention for variable names
Documentation
Create a template SQL table
....

[Sub-Level Issues] Can be created and must be referenced here.

hstore vs. json vs. jsonb

I'm currently looking for an improvement for variable data structures (key-value-pairs).

Research:

Criteria:

Performance (Speed)
Performance (Storage size)
Usability (Syntax)

Examples:

OSM uses hstore.

I use this issue as documentation. Feel free to comment and improve!

Setup CI

Add a travis.yml to set up a CI pipeline.

The pipeline must include some validation/test of the OEDataModel. See #17

Enable bandwidths for scalars and timeseries

A new OEDatamodel version shall be developed which allows bandwidths for scalars and timeseries per scenario.
IMO this needs three implementation changes:

value in scalars must be changed to hold bandwidths (allowing fixed, discrete and continuous values)
multiple timeseries for the same region/tech/etc. are allowed, but must be distinguishable by the user
a concrete instance of the scenario, using fixed values and fixed timeseries, must be built from the scenario with bandwidths

Possible implementations can be discussed here.
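
One possible shape for the first and third point, offered only as a discussion sketch: the class names `Fixed`, `Discrete`, `Continuous`, the function `resolve` and the `selector` parameter are all hypothetical and not part of any OEDatamodel version.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Fixed:
    value: float


@dataclass(frozen=True)
class Discrete:
    options: Tuple[float, ...]


@dataclass(frozen=True)
class Continuous:
    low: float
    high: float


def resolve(band, selector=0.5):
    """Collapse a bandwidth into one fixed value.

    `selector` is a number in [0, 1]: it picks a position inside a
    continuous range, or an index bucket among discrete options.
    """
    if isinstance(band, Fixed):
        return band.value
    if isinstance(band, Discrete):
        # Map selector to an index and clamp so that 1.0 picks the last option.
        index = min(int(selector * len(band.options)), len(band.options) - 1)
        return band.options[index]
    if isinstance(band, Continuous):
        return band.low + selector * (band.high - band.low)
    raise TypeError(f"unsupported bandwidth type: {type(band).__name__}")
```

Applying `resolve` to every scalar of a scenario would yield a concrete instance with fixed values only. Timeseries would need a separate mechanism for telling alternatives apart.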

Add CI to repository