brightway-lca / bw_hybrid


Hybrid (Input-Output/Process-based) Life-Cycle Assessment

Home Page: https://docs.brightway.dev/projects/hybrid/

License: BSD 3-Clause "New" or "Revised" License

Python 1.37% Jupyter Notebook 98.59% XSLT 0.05%

lca life-cycle-assessment

bw_hybrid's Introduction

Brightway Hybrid

Warning

This package is a work in progress.
For further questions, contact @michaelweinold

This package provides tools for the hybridization of inventory databases and multi-regional input-output tables.

📚 Literature (Excerpt):

Agez et al., 2020, Agez et al., 2019

📁 Related Repositories:

pylcaio


bw_hybrid's Issues

Create data structure input/output documentation document

Notebook/document should contain i/o structures and schema of tables/matrices.

`ecospold2matrix` Data Flowchart

Ecospold xml Ingestion

flowchart TD
Ain[IntermediateExchanges.xml] --> A["extract_products()"] --> Aout[products: pd.DataFrame]
Bin[ActivityIndex.xml <br> ActivityNames.xml] --> B["extract_activities()"] -->  Bout[activities: pd.DataFrame]
C["get_labels()"] --> Cextract1 --> Cout[PRO: pd.DataFrame <br> STR: pd.DataFrame]
Cin[ElementaryExchanges.xml <br> spold files] --> C --> Cextract2 --> Cout
Din[spold files] --> D["get_flows()"] --> Dextract["extract_flows()"] --> Dout[inflows: pd.DataFrame <br> outflows: pd.DataFrame <br> elementary_flows: pd.DataFrame]
Cextract1["build_PRO()"]
Cextract2["build_STR()"]

DataFrame Table

| object | type | columns (e2m) | corresponding columns (bw) or Ecospold | comment |
| --- | --- | --- | --- | --- |
| `inflows` | `pd.DataFrame` | fileId, sourceActivityId, productId, amount, row_index | `.spold` file name, activityLinkId, intermediateExchangeId, amount, ??? | extracted from `.spold` files |
| `outflows` | `pd.DataFrame` | fileId, productId, amount, productionVolume, outputGroup | `.spold` file name, intermediateExchangeId, amount, productionVolume, outputGroup | extracted from `.spold` files |
| `elementary_flows` | `pd.DataFrame` | fileId, elementaryExchangeId, amount | `.spold` file name, elementaryExchangeId, amount | extracted from `.spold` files |
| `activities` | `pd.DataFrame` | activityId, activityNameId, activityType, startDate, endDate, activityName | id, activityNameId, activityType, startDate, endDate, activityName | extracted from `ActivityIndex.xml` with activityName data merged from `ActivityNames.xml` |
| `products` | `pd.DataFrame` | productName, unitName, productId, unitId, cpc, properties | name, unitName, id, unitId, classification == 'cpc', properties | extracted from `IntermediateExchanges.xml` |
| `STR` | `pd.DataFrame` | id, name, unit, cas, comp, subcomp | id, name, unitName, casNumber, compartment, subcompartment | extracted from `ElementaryExchanges.xml` |
| `PRO` | `pd.DataFrame` | 'activityId', 'productId', 'activityName', 'ISIC', 'price', 'priceUnit', 'EcoSpoldCategory', 'geography', 'technologyLevel', 'macroEconomicScenario', properties_x, 'productionVolume', 'productName', 'unitName', 'cpc', properties_y, 'activityNameId', 'activityType', 'startDate', 'endDate', 'activityName_duplicate' | 'id', 'productId', 'activityName', 'ISIC', 'price', 'priceUnit', 'EcoSpoldCategory', 'geography', 'technologyLevel', 'macroEconomicScenario', properties_x, 'productionVolume', 'productName', 'unitName', 'cpc', properties_y, 'activityNameId', 'activityType', 'startDate', 'endDate', 'activityName_duplicate' | extracted from `.spold` files |

Preparation and Cleanup

flowchart TD
in[activities: pd.DataFrame <br> products: pd.DataFrame] --> F["complement_labels()"] --> out[PRO: pd.DataFrame <br> STR: pd.DataFrame]

DataFrame Table

| object | type | columns (e2m) | corresponding columns (bw) or Ecospold | comment |
| --- | --- | --- | --- | --- |
| `PRO` | `pd.DataFrame` | all prev. columns, 'productionVolume', all cols. from `products`, all cols. from `activities` | all prev. columns, 'productionVolume', all cols. from `products`, all cols. from `activities` | for merge keys, see below |

Join Table

| left | right | left_key | right_key | added cols. |
| --- | --- | --- | --- | --- |
| `PRO` | `outflows` | `index` = 'abc' | `index` = 'abc' | 'productionVolume' |
| `PRO` | `products` | `index` = 'abc' | `index` = 'abc' | all except potential duplicates |
| `PRO` | `activities` | `index` = 'abc' | `index` = 'abc' | all except potential duplicates |
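
The joins listed in the table above can be sketched with pandas index-based joins. This is only an illustration under assumptions, not the code used by `ecospold2matrix`: the toy frames, their values and the helper `join_new_columns` are hypothetical, and all frames are assumed to share the same index (the 'abc' placeholder).

```python
import pandas as pd

# Toy frames sharing one index (the 'abc' placeholder in the join table).
idx = pd.Index(["f1", "f2"], name="abc")
PRO = pd.DataFrame({"activityName": ["steel production", "transport"]}, index=idx)
outflows = pd.DataFrame({"productionVolume": [10.0, 5.0], "amount": [1.0, 1.0]}, index=idx)
products = pd.DataFrame({"productName": ["steel", "freight"], "unitName": ["kg", "tkm"]}, index=idx)
activities = pd.DataFrame(
    {"activityName": ["steel production", "transport"], "activityType": [1, 1]}, index=idx
)


def join_new_columns(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Left-join on the index, adding only columns that `left` does not have yet."""
    new_cols = [col for col in right.columns if col not in left.columns]
    return left.join(right[new_cols], how="left")


# Only 'productionVolume' is taken from outflows, as in the first table row.
PRO = PRO.join(outflows[["productionVolume"]], how="left")
# From products and activities, everything except potential duplicates is added.
PRO = join_new_columns(PRO, products)
PRO = join_new_columns(PRO, activities)
print(PRO.columns.tolist())
```

Skipping columns that already exist is one way to read "all except potential duplicates"; the original code may instead rely on suffixes, as the `properties_x` and `properties_y` columns suggest.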

DataFrame Construction (change heading)

flowchart TD
in[inflows: pd.DataFrame <br> elementary_flows: pd.DataFrame <br> outflows: pd.DataFrame] --> F["build_AF()"] --> out[A: pd.DataFrame <br> F: pd.DataFrame]

DataFrame Table

Pivot Table

| output | input | index | columns | values | output index | output cols. |
| --- | --- | --- | --- | --- | --- | --- |
| `A` | `inflows` | 'row_index' = 'fileId' + 'productId' | 'fileId' | 'amount' | `PRO`.index = 'abc' | `PRO`.index = 'abc' |
| `F` | `elementary_flows` | 'elementaryExchangeId' | 'fileId' | 'amount' | `STR`.index = 'abc' | `PRO`.index = 'abc' |
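
A minimal pandas sketch of the two pivots in the table above, using invented toy data. It assumes that the 'fileId' and 'row_index' values use the same labels as `PRO.index` and that the 'elementaryExchangeId' values match `STR.index`; the actual `build_AF()` implementation may differ.

```python
import pandas as pd

# Long-format flows; column names follow the tables above, values are invented.
inflows = pd.DataFrame({
    "fileId": ["a1_p1", "a2_p2"],      # consuming process
    "row_index": ["a2_p2", "a1_p1"],   # supplying process
    "amount": [0.5, 2.0],
})
elementary_flows = pd.DataFrame({
    "fileId": ["a1_p1", "a2_p2", "a2_p2"],
    "elementaryExchangeId": ["co2", "co2", "ch4"],
    "amount": [1.2, 0.3, 0.01],
})
pro_index = pd.Index(["a1_p1", "a2_p2"])  # stands in for PRO.index
str_index = pd.Index(["co2", "ch4"])      # stands in for STR.index

# Technology matrix: rows = supplying processes, columns = consuming processes.
A = (
    inflows.pivot_table(index="row_index", columns="fileId", values="amount", aggfunc="sum")
    .reindex(index=pro_index, columns=pro_index)
    .fillna(0.0)
)
# Elementary flow matrix: rows = stressors, columns = processes.
F = (
    elementary_flows.pivot_table(
        index="elementaryExchangeId", columns="fileId", values="amount", aggfunc="sum"
    )
    .reindex(index=str_index, columns=pro_index)
    .fillna(0.0)
)
print(A)
print(F)
```

The `reindex` step aligns both outputs with the label tables, so processes without inflows or emissions still get an all-zero row or column.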

Characterization

flowchart TD
in["LCIA Implementation v3.8.xlsx"] --> F1["if-else"]  --> F2["simple_characterisation_matching()"] --> out[A: pd.DataFrame <br> C: pd.DataFrame]

Pivot Table

| output | input | index | columns | values | output index | output cols. |
| --- | --- | --- | --- | --- | --- | --- |
| `C` | `C_long` | 'impact_label' | 'stressorId' | 'CF' | N/A | N/A |
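
The pivot from `C_long` to `C` could look roughly like the sketch below. The labels and characterization factors are invented for illustration and do not come from the LCIA file.

```python
import pandas as pd

# Long-format characterization factors (invented values).
C_long = pd.DataFrame({
    "impact_label": ["impact_a", "impact_a", "impact_b"],
    "stressorId": ["s1", "s2", "s3"],
    "CF": [1.0, 28.0, 1.5],
})

# One row per impact category, one column per stressor; missing pairs become 0.
C = C_long.pivot(index="impact_label", columns="stressorId", values="CF").fillna(0.0)
print(C)
```

`DataFrame.pivot` raises a `ValueError` if an (impact_label, stressorId) pair occurs more than once, which can serve as a cheap consistency check on the long table.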

`pylcaio.LCAIO.update_prices_electricity`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

Database Loader: Environmental Extensions (`complete_extensions`)

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

Auxiliary Function: `get_inflation`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

`pylcaio.LCAIO.extend_inventory`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

`pylcaio.LCAIO.apply_scaling_without_prices` ("CONVERSION PART")

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

`pylcaio.LCAIO.identify_rows`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/weinold/github/pylcaio_integration_with_brightway/notebooks/main.ipynb Cell 16 in <cell line: 1>()
      3 else:
      4     parser = e2m.Ecospold2Matrix(
      5         sys_dir = path_ecoinvent_local,
      6         project_name = 'ecoinvent_3_5_cutoff',
   (...)
      9         positive_waste = False,
     10         nan2null = True)
---> 11     parser.ecospold_to_Leontief(
     12         fileformats = 'Pandas',
     13         with_absolute_flows=True)
     14     ecoinvent = read_ecoinvent_pickle(path_e2m_project)

File ~/miniconda3/envs/pylcaio/lib/python3.9/site-packages/ecospold2matrix/ecospold2matrix.py:398, in Ecospold2Matrix.ecospold_to_Leontief(self, fileformats, with_absolute_flows, lci_check, rtol, atol, imax, characterisation_file, ardaidmatching_file)
    396 self.extract_activities()
    397 self.get_flows()
--> 398 self.get_labels()
    400 # Clean up if necessary
    401 self.__find_unsourced_flows()

File ~/miniconda3/envs/pylcaio/lib/python3.9/site-packages/ecospold2matrix/ecospold2matrix.py:617, in Ecospold2Matrix.get_labels(self)
    612         self.log.info(msg.format('Labels', filename, sha1))
    614 # OR EXTRACT FROM ECOSPOLD DATA...
    615 else:
--> 617     self.build_PRO()
    618     self.build_STR()
    620     # and optionally pickle for further use

File ~/miniconda3/envs/pylcaio/lib/python3.9/site-packages/ecospold2matrix/ecospold2matrix.py:1166, in Ecospold2Matrix.build_PRO(self)
   1164     PRO.loc[file_index, ['price', 'priceUnit']] = [price, price_unit]
   1165 # Or complain if price already exists
-> 1166 elif not np.allclose([price_org], [price]):
   1167     print("WARNING: We have heterogeneous prices")
   1168 else:

File <__array_function__ internals>:180, in allclose(*args, **kwargs)

File ~/miniconda3/envs/pylcaio/lib/python3.9/site-packages/numpy/core/numeric.py:2265, in allclose(a, b, rtol, atol, equal_nan)
   2194 @array_function_dispatch(_allclose_dispatcher)
   2195 def allclose(a, b, rtol=1.e-5, atol=1.e-8, equal_nan=False):
   2196     """
   2197     Returns True if two arrays are element-wise equal within a tolerance.
   2198 
   (...)
   2263 
   2264     """
-> 2265     res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
   2266     return bool(res)

File <__array_function__ internals>:180, in isclose(*args, **kwargs)

File ~/miniconda3/envs/pylcaio/lib/python3.9/site-packages/numpy/core/numeric.py:2372, in isclose(a, b, rtol, atol, equal_nan)
   2369     dt = multiarray.result_type(y, 1.)
   2370     y = asanyarray(y, dtype=dt)
-> 2372 xfin = isfinite(x)
   2373 yfin = isfinite(y)
   2374 if all(xfin) and all(yfin):

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
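
The traceback ends in `np.allclose`, which fails with a `TypeError` when one of its inputs is not numeric. Which value triggered the error in `build_PRO()` is not known from the traceback; the sketch below assumes a non-numeric price (here `None`) and shows a hypothetical helper, `prices_match`, that coerces both values to `float` before comparing. It is not the upstream fix.

```python
import numpy as np


def prices_match(price_org, price, rtol=1e-5, atol=1e-8):
    """Compare two prices after coercing both to float; unparsable values never match."""
    try:
        a, b = float(price_org), float(price)
    except (TypeError, ValueError):
        return False
    return bool(np.isclose(a, b, rtol=rtol, atol=atol))


# A non-numeric entry makes np.allclose raise a TypeError.
try:
    np.allclose([None], [0.12])
    raised = False
except TypeError:
    raised = True

print(raised)
print(prices_match("0.12", 0.12))  # numeric string is coerced before comparing
print(prices_match(None, 0.12))    # missing price is treated as "no match"
```

Whether a missing or unparsable price should count as a mismatch or be skipped silently is a design decision for the maintainers.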

Implement code formatting GitHub Action

...for instance:

Database Loader: Document Matrices

Document in 2022 Publications.

Exiobase

Ecoinvent

`pylcaio.LCAIO.save_system`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

The library and repo names differ (bw2hybrid versus brightway2-hybridization). Moreover, this library should target not Brightway 2 but the "next generation" of Brightway, which is already partially available. Therefore, I suggest that we change both the repo and library names to either bw_hybridization, bw_hybrid, or bw_heuristic_hybridization.

`pylcaio.LCAIO.calc_productions`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

Auxiliary Function: `completing_extensions`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

Switched variable names in `1_pylcaio_to_brightway.ipynb` notebook

path_file_ecoinvent_raw and path_file_exiobase_raw are flipped:

exiobase: pymrio.IOSystem = pymrio.parse_exiobase3(path_dir_exiobase_raw / str_exiobase_zip_file)
with open(path_file_ecoinvent_raw, 'wb') as file_handle:
    pickle.dump(obj = exiobase, file = file_handle, protocol=pickle.HIGHEST_PROTOCOL)
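
A sketch of what the corrected cell might look like, with the Exiobase object written to the Exiobase path. To keep it self-contained, a plain dictionary stands in for the `pymrio.IOSystem` object and the path points to a temporary directory; both are placeholders, not the notebook's actual values.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for the object returned by pymrio.parse_exiobase3(...); not a real IOSystem.
exiobase = {"name": "exiobase placeholder"}

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical target path; the notebook defines its own path variables.
    path_file_exiobase_raw = Path(tmp) / "exiobase_raw.pickle"

    # Write the Exiobase object to the Exiobase path (not the ecoinvent one).
    with open(path_file_exiobase_raw, "wb") as file_handle:
        pickle.dump(obj=exiobase, file=file_handle, protocol=pickle.HIGHEST_PROTOCOL)

    # Read it back to confirm the round trip.
    with open(path_file_exiobase_raw, "rb") as file_handle:
        restored = pickle.load(file_handle)

print(restored == exiobase)
```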

Integrate `bw_exiobase`

https://github.com/brightway-lca/bw_exiobase

Database Loader: Capital Endogenization (`path_to_capitals`)

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

CCL Deliverable: Database

`pylcaio.LCAIO.hybridize`: "CONVERSION PART"

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

Documentation

| `pylcaio` variable | domain | official name | status | comment |
| --- | --- | --- | --- | --- |
| `A_io` | IO | A matrix | ongoing | |
| `Y_io` | IO | final demand matrix | ongoing | |
| `F_io` | IO | satellite account coefficient matrix | ongoing | usually named `S` |
| `PRO_f` | PDB | LCA database metadata | ongoing | |
| `A_ff` | PDB | ??? | not yet started | |
| `F_f` | PDB | environmental matrix | ongoing | |

`pylcaio.LCAIO.low_production_volume_processes`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

`pylcaio.DatabaseLoader.combine_ecoinvent_exiobase`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

Create list of additional (non-essential) `pylcaio` functionality

'regionalization of impacts' (compare complete_extensions=True)
'capital goods' (compare path_to_capitals=True)
'more environmental extensions' (compare impact_world=True and complete_extensions=True)

Document `pylcaio.LCAIO.aggregate()` function

...compare also: Agez et al., 2020, Agez et al., 2019

`pylcaio.LCAIO.extract_*`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

Move Static Data Files to zenodo.org

For better management of the data files, move all static data files to zenodo.org and use zenodo_get to download the files.

Database Loader: Regionalization (`regionalized`)

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

Matrix Inversion (Solution of Linear System) for Dense Hybrid Matrices

@MaximeAgez kindly provided a pickled hybrid matrix for testing (it has been almost two years since I last managed to run pylcaio): PolyBox Link

We will likely need to adapt bw2calc.lca_base.solve_linear_system() to allow for the inversion of large non-sparse/dense matrices.

pylcaio is already using a Schur complement ("block matrix inversion formula") somewhere... this is likely the way to go.
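
For reference, the block approach can be sketched as follows. For a system [[A, B], [C, D]] [x1, x2] = [b1, b2] with invertible A, the Schur complement is S = D - C A⁻¹ B, and the solution is x2 = S⁻¹ (b2 - C A⁻¹ b1), x1 = A⁻¹ (b1 - B x2). The code below is a generic NumPy illustration on a small random matrix; it is neither the pylcaio nor the `bw2calc` implementation, and how the hybrid matrix would be split into blocks is left open.

```python
import numpy as np


def solve_block_system(A, B, C, D, b1, b2):
    """Solve [[A, B], [C, D]] @ [x1, x2] = [b1, b2] via the Schur complement of A."""
    A_inv_B = np.linalg.solve(A, B)     # A^-1 B without forming A^-1 explicitly
    A_inv_b1 = np.linalg.solve(A, b1)   # A^-1 b1
    S = D - C @ A_inv_B                 # Schur complement of A
    x2 = np.linalg.solve(S, b2 - C @ A_inv_b1)
    x1 = A_inv_b1 - A_inv_B @ x2
    return x1, x2


rng = np.random.default_rng(0)
n1, n2 = 4, 3
# Strictly diagonally dominant matrix, so M, A and S are all invertible.
M = rng.random((n1 + n2, n1 + n2)) + (n1 + n2) * np.eye(n1 + n2)
b = rng.random(n1 + n2)
A, B = M[:n1, :n1], M[:n1, n1:]
C, D = M[n1:, :n1], M[n1:, n1:]

x1, x2 = solve_block_system(A, B, C, D, b[:n1], b[n1:])
x_ref = np.linalg.solve(M, b)  # direct dense solve for comparison
matches = np.allclose(np.concatenate([x1, x2]), x_ref)
print(matches)
```

The potential benefit is that only systems of the size of the individual blocks need to be solved, so a large sparse block can be handled by a sparse solver while the dense part stays small.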

Add GitHub actions

Best Practices:

refurb

`KeyError: 'method'` when reading pickled hybridized databases using `pylcaio.Analysis`

When attempting to read a pickled pylcaio.LCAIO class instance with the pylcaio.Analysis class, the following error is raised:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/pylcaio/src/pylcaio.py:1915, in Analysis.__init__(self, path_to_hybrid_system)
   1914 try :
-> 1915     self.C = pd.concat([pd.DataFrame(self.C_f.todense(), index=self.IMP['method'], columns=self.STR['MATRIXID']),
   1916                pd.DataFrame(self.C_io.todense(), index=self.impact_categories_IO,
   1917                             columns=self.flows_of_IO)]).fillna(0)
   1918 except KeyError:

KeyError: 'method'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
Cell In [10], line 1
----> 1 Analysis = pylcaio.Analysis(path_file_hybrid)

File ~/pylcaio/src/pylcaio.py:1919, in Analysis.__init__(self, path_to_hybrid_system)
   1915     self.C = pd.concat([pd.DataFrame(self.C_f.todense(), index=self.IMP['method'], columns=self.STR['MATRIXID']),
   1916                pd.DataFrame(self.C_io.todense(), index=self.impact_categories_IO,
   1917                             columns=self.flows_of_IO)]).fillna(0)
   1918 except KeyError:
-> 1919     self.C = pd.concat([pd.DataFrame(self.C_f.todense(), index=self.IMP['method'], columns=self.STR['MATRIXID']),
   1920                pd.DataFrame(self.C_io.todense(), index=self.impact_categories_IO,
   1921                             columns=self.flows_of_IO)]).fillna(0)
   1922 self.C_index = self.C.index.tolist()
   1923 self.C = back_to_sparse(self.C)

KeyError: 'method'
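
The traceback shows that the `except KeyError` branch repeats the same `self.IMP['method']` lookup as the `try` branch, so the fallback raises the same error again. The toy example below reproduces that pattern and sketches one possible alternative. The helper `first_available_column` and the column name `impact_label` are hypothetical; which column the pickled `IMP` table actually contains would need to be checked.

```python
import pandas as pd

# Toy stand-in for self.IMP without a 'method' column (column name is hypothetical).
IMP = pd.DataFrame({"impact_label": ["climate change"]})


def lookup_with_identical_fallback(frame):
    """Mimic the pattern in the traceback: the fallback repeats the failing lookup."""
    try:
        return frame["method"]
    except KeyError:
        return frame["method"]  # same lookup, so the KeyError is raised again


def first_available_column(frame, candidates):
    """Return the first column from `candidates` that exists in `frame`."""
    for name in candidates:
        if name in frame.columns:
            return frame[name]
    raise KeyError(f"none of {candidates} found in {list(frame.columns)}")


try:
    lookup_with_identical_fallback(IMP)
    reraised = False
except KeyError:
    reraised = True

labels = first_available_column(IMP, ["method", "impact_label"])
print(reraised, labels.tolist())
```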

"TypeError: SparseArray does not support item assignment via setitem" for Ecospold2Matrix(positive_waste = True)

Compare majeau-bettez/ecospold2matrix#13.
Current workaround: set positive_waste = False
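
The error message itself comes from pandas: `SparseArray` objects do not support item assignment. The snippet below reproduces it in isolation and shows one generic way around it, namely editing a dense copy and converting back. This is an illustration only, not a patch for `ecospold2matrix`; the linked upstream issue remains the reference.

```python
import numpy as np
import pandas as pd

sparse = pd.arrays.SparseArray([0.0, -2.0, 0.0, 3.0], fill_value=0.0)

# Direct item assignment on a SparseArray raises a TypeError.
try:
    sparse[1] = 2.0
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Generic alternative: edit a dense copy, then convert back to a sparse array.
dense = np.asarray(sparse).copy()
dense[1] = 2.0
updated = pd.arrays.SparseArray(dense, fill_value=0.0)
print(assignment_failed, np.asarray(updated).tolist())
```

Densifying is only practical for arrays that fit in memory, which is why the `positive_waste = False` workaround may remain preferable for full databases.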

Notebook `1_pylcaio_to_brightway.ipynb` doesn't include complete dependencies

The libraries pymrio and ecospold2matrix are not listed in the indicated environment file.

`pylcaio.LCAIO.correct_double_counting`

Documentation:

add data input/output format documentation
add mathematical formulation of data manipulation

Refactoring:

implement function using native Pandas/NumPy objects and methods
document performance improvements
create MWE input/output data for unit test

Convert Static Data Files to `json` and `csv`

Currently, the data files come in txt, csv and xlsx formats. To avoid unsafe functions like eval() and to simplify data import, all data should be converted to either json or csv.

File names should indicate the target Python type (e.g. dict_ecoinvent_aux_data.json or list_geography_correspondence.json).
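
A one-off conversion could be sketched as below, assuming a legacy text file that holds a Python dict literal. The file contents and the legacy file name are made up for illustration. `ast.literal_eval` parses literals only, so it avoids executing arbitrary code during the migration, and afterwards the data can be read with `json` alone.

```python
import ast
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical legacy file holding a Python dict literal as text.
    path_txt = Path(tmp) / "ecoinvent_aux_data.txt"
    path_txt.write_text("{'CH': 'Switzerland', 'DE': 'Germany'}", encoding="utf-8")

    # literal_eval only accepts Python literals, unlike eval().
    data = ast.literal_eval(path_txt.read_text(encoding="utf-8"))

    # The file name carries the target Python type, as proposed above.
    path_json = Path(tmp) / "dict_ecoinvent_aux_data.json"
    path_json.write_text(json.dumps(data, indent=2, sort_keys=True), encoding="utf-8")

    # From now on, loading needs neither eval() nor literal_eval().
    restored = json.loads(path_json.read_text(encoding="utf-8"))

print(restored)
```

JSON only supports string keys and a handful of value types, so dictionaries keyed by tuples or integers would need an explicit conversion step.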
