
wikidatazh_2019's Introduction

Wikidata ZH Hackathon 2019

Work in progress
Code
Data sources
Statistical Definitions
Screenshots

Observable notebook with visualization of available data

We combine challenges 4 and 6. Our goal is to update the population figures in Wikidata based on open statistical data from the Canton and the City of Zurich.

Team: Katharina Kaelin, Roman Karavia, Sebastian Windeck, Philipp Rütimann, Matthias Mazenauer, Michael Grüebler

Work in Progress 24.11.2019

Dataflow

Kanban

Code

Prerequisites

You will need Python 3 and pip installed on your machine.

Install

Execute the steps below in a terminal.

  1. Create a virtual environment
python -m venv .venv
  2. Activate the virtual environment
source .venv/bin/activate
  3. Install the dependencies
pip install -r requirements.txt

Run

Run the Jupyter Notebook (work in progress)

jupyter notebook ProjectUpdate.ipynb

Run one of the python scripts, for example compare_data.py:

python compare_data.py

Data Sources

Population of Municipalities in the Canton of Zurich

Dataset description: Resident population at year end by civil-law domicile (from 2010 including provisionally admitted persons who have lived in the municipality for more than one year, but excluding weekly residents and asylum seekers).

Get the data as CSV from the Canton of Zurich via the API: https://www.web.statistik.zh.ch:8443/gp/GP?type=EXPORT&indikatoren=133&raumtyp=1&text=yes

Data Format: JSON or CSV

Attributes:

Filter: indikatoren=133 (Bevölkerung, Personen: population in persons) / raumtyp=1 (Alle Gemeinden: all municipalities)

| Technical Name | Field Description | Definition |
| --- | --- | --- |
| RAUMEINHEIT_ID | Municipality ID | |
| DATEN_VORHANDEN | true/false data status | true/false for data status |
| RAUMEINHEIT_NAME | Municipality name | Name of the municipality |
| BFS | BFS municipality number | Unique municipality number assigned by the BFS |
| ALLE_WERTE | Number of persons per year | |

License: Open use. Must provide the source.

Source: https://opendata.swiss/de/dataset/bevolkerung-pers
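The export link above is a plain query string, so it can be assembled from its parts. The following is a minimal sketch, not the project's own code: the function name `build_export_url` is hypothetical, and only the parameter names and values visible in the link are used.

```python
from urllib.parse import urlencode

# Base endpoint taken from the export link in this section.
BASE_URL = "https://www.web.statistik.zh.ch:8443/gp/GP"


def build_export_url(indikatoren: int = 133, raumtyp: int = 1) -> str:
    """Return the export URL for one indicator and one spatial unit type.

    Hypothetical helper for illustration only.
    """
    params = {
        "type": "EXPORT",
        "indikatoren": indikatoren,  # 133 = population in persons
        "raumtyp": raumtyp,          # 1 = all municipalities
        "text": "yes",
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_export_url()
print(url)
```

The resulting URL could then be handed to a CSV reader. The separator and encoding of the export are not documented here and would need to be checked against a real download.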

Population of the City of Zurich

Dataset description: Economic resident population (wirtschaftliche Wohnbevölkerung) of the City of Zurich by statistical city quarter and year, since 1970. Data quality: figures for 1970 to 1992 are extrapolated results; since 1993 the count follows the register of the Personenmeldeamt (residents' registration office). https://data.stadt-zuerich.ch/dataset/bev_bestand_jahr_quartier_seit1970_od3240/resource/570f006e-2f2a-4b1f-9233-c4916c753475

Data Format: CSV

Attributes:

| Technical Name | Field Description | Definition |
| --- | --- | --- |
| StichtagDatJahr | Event year | Year |
| QuarSort | City quarter (sort) | Official ID of the statistical city quarter (Integer) |
| QuarLang | City quarter | Name of the statistical city quarter (String) |
| AnzBestWir | Economic population | Economically resident persons (Integer) |

License: Creative Commons CCZero

Source: https://data.stadt-zuerich.ch/dataset/bev_bestand_jahr_quartier_seit1970_od3240
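As an illustration of the column layout described above, the sketch below reads a few rows and keeps the most recent year per quarter. It is not taken from the project code. The sample rows, their quarter numbers, and their population values are made-up placeholders, and a comma separator is assumed.

```python
import csv
import io

# Made-up sample rows in the documented column layout. The numbers are
# placeholders, not real population figures.
SAMPLE_CSV = """StichtagDatJahr,QuarSort,QuarLang,AnzBestWir
2017,101,Höngg,100
2018,101,Höngg,110
2018,102,Wipkingen,50
"""


def latest_population_per_quarter(csv_text: str) -> dict:
    """Map each QuarSort to (year, population) of its most recent year."""
    latest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        quarter = int(row["QuarSort"])
        year = int(row["StichtagDatJahr"])
        population = int(row["AnzBestWir"])
        # Keep the row only if it is newer than what we have seen so far.
        if quarter not in latest or year > latest[quarter][0]:
            latest[quarter] = (year, population)
    return latest


result = latest_population_per_quarter(SAMPLE_CSV)
print(result)
```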

Mapping of Quarters to Wikidata Entities

Dataset description: Matching table of quarter numbers to Wikidata IDs. This list was created at the Wikimedia Hackathon 2014 in Zurich. It links the statistical quarter numbers to the Wikidata item IDs.

https://data.stadt-zuerich.ch/dataset/matchingtab-quartnr-wikidataid/resource/0090f2ed-1df9-4953-9561-5d413fd74758

Data Format: CSV

| Technical Name | Field Description | Definition |
| --- | --- | --- |
| QNr | Quarter number | Official ID of the statistical quarters |
| QName | Quarter name | Official name of the statistical quarters |
| DataItemID | DataItemID | Wikidata item ID of the statistical quarters |

License: Creative Commons CCZero

Source: https://data.stadt-zuerich.ch/dataset/matchingtab-quartnr-wikidataid
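The matching table can be used as a lookup from quarter number to Wikidata item. The sketch below shows one way to do this, assuming that `QNr` uses the same numbering as `QuarSort` in the population dataset. The function names are hypothetical and the IDs in the sample are placeholders, not real Wikidata items.

```python
import csv
import io

# Placeholder mapping rows; "Q0000001" and "Q0000002" are NOT real Wikidata IDs.
MAPPING_CSV = """QNr,QName,DataItemID
101,Höngg,Q0000001
102,Wipkingen,Q0000002
"""


def load_quarter_mapping(csv_text: str) -> dict:
    """Return a dict that maps QNr (int) to DataItemID (str)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["QNr"]): row["DataItemID"] for row in reader}


def attach_wikidata_ids(population_by_quarter: dict, mapping: dict):
    """Split quarters into those with a Wikidata ID and those without.

    Returns (matched, unmatched): matched maps Wikidata ID to population,
    unmatched lists the quarter numbers missing from the mapping.
    """
    matched, unmatched = {}, []
    for quarter, population in population_by_quarter.items():
        item_id = mapping.get(quarter)
        if item_id is None:
            unmatched.append(quarter)
        else:
            matched[item_id] = population
    return matched, unmatched


mapping = load_quarter_mapping(MAPPING_CSV)
matched, unmatched = attach_wikidata_ids({101: 110, 999: 5}, mapping)
print(matched, unmatched)
```

Returning the unmatched quarter numbers separately makes gaps in the matching table visible instead of silently dropping them.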

Statistical Definition

The Canton counts the population using the "ständige" (permanent resident) definition, while the City of Zurich uses the "wirtschaftliche" (economic resident) definition.

Bevoelkerungsdefinition

More about this definition: https://www.stadt-zuerich.ch/prd/de/index/statistik/themen/bevoelkerung/bevoelkerungsentwicklung/bevoelkerungsdefinition.html

Wikidata Definition

Important Identifiers in Wikidata (SPARQL terminology):

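| Name | Wikidata ID |
| --- | --- |
| City of Zurich (Stadt Zürich) | Q72 |
| Population | P1082 |
| Quarter (Quartier) | Q19644586 |
| Municipality (Gemeinde) | Q70208 |
| In the Canton of Zurich (im Kanton ZH) | Q11943 |
| Date (Year) | P585 |
| Reference URL (URL der Fundstelle) | P854 |
| Determination method | P459 |
| Statistics (Statistik) | Q12483 |
| Preferred Rank | wikibase:rank |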

Source for Gemeinden:
https://statistik.zh.ch/internet/justiz_inneres/statistik/de/daten/gemeindeportraet_kanton_zuerich.html#a-content

Source for Quartiere:
https://data.stadt-zuerich.ch/dataset/bev_bestand_jahr_quartier_seit1970_od3240
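To show how the identifiers listed in this section fit together, here is a sketch of a SPARQL query for the population statements of all quarters. It is not the query used by the project. It assumes that quarters are linked to Q19644586 through P31 ("instance of"), a property that is not listed above, and it relies on the prefixes that the Wikidata Query Service predefines. The function name is hypothetical.

```python
def build_population_query(class_qid: str = "Q19644586") -> str:
    """Return a SPARQL query for the population statements of a class of items."""
    return f"""
SELECT ?item ?population ?date ?rank WHERE {{
  ?item wdt:P31 wd:{class_qid} .            # assumption: linked via "instance of"
  ?item p:P1082 ?statement .                # population statement node
  ?statement ps:P1082 ?population ;         # population value
             wikibase:rank ?rank .          # statement rank
  OPTIONAL {{ ?statement pq:P585 ?date . }}  # point-in-time qualifier
}}
""".strip()


query = build_population_query()
print(query)
```

Passing `"Q70208"` instead would target municipalities, although a filter for the Canton of Zurich (Q11943) would still have to be added.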

Limitations

  • Only currently active municipalities are updated (source: geo.admin.ch)
  • If a Wikidata entry exists with a date other than 31 December (e.g. 1.1.2017 or just 2017), a new entry is created anyway
  • For the City of Zurich, only Quartiere (quarters) are selected, not Kreise (districts)
  • The bot does not work yet: the current code deletes all existing population data before inserting the new record
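The date limitation above can be made concrete with a small sketch. It assumes, as the limitation suggests, that an existing statement only counts as a match when it is dated exactly 31 December of the year in question. The function name is hypothetical and the code is not taken from the project.

```python
from datetime import date


def needs_new_entry(existing_dates, year: int) -> bool:
    """Return True if no existing statement is dated exactly 31 December of `year`."""
    return date(year, 12, 31) not in existing_dates


# A statement dated 1.1.2017 does not match 31.12.2017, so a new entry is created.
print(needs_new_entry([date(2017, 1, 1)], 2017))

# A statement dated 31.12.2017 matches, so nothing new is needed.
print(needs_new_entry([date(2017, 12, 31)], 2017))
```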

Screenshots

Manual run of the bot only for the Quarter Höngg

Höngg Wikidata

Höngg Wikipedia

Höngg Siri

Code


city_of_zurich.py

Function: import_StadtZH_api Description: :return:

Function: import_kantonZH_api Description: :return:

Function: import_swisstopowikidata_kantonZH Description: Validates BFS numbers: checks which BFS numbers are active by comparing geo.admin and Wikidata :return:

Function: import_wikidata_kantonZH Description: Generates a SPARQL query and converts the result to a pandas DataFrame :return: pd.DataFrame['wikidata_id','date','population','qualifier']

Function: import_wikidata_StadtZH Description: Generates a SPARQL query and converts the result to a pandas DataFrame :return: pd.DataFrame['wikidata_id','date','population','qualifier']

main Description: :return:

compare_data.py

Function: compare_dfs_kt Description: This function checks all entries from the Canton of Zurich data and compares them with the Wikidata data :param df_wikidata: :param df_statdata: :return: list of indices in df_statdata that are not in df_wikidata and therefore need to be uploaded to Wikidata

Function: compare_dfs_gm Description: :return:

main Description: :return:
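The comparison described for compare_dfs_kt can be sketched in plain Python as follows. This is an illustration only, not the project's implementation, which works on pandas DataFrames. It assumes that both sides carry a `wikidata_id` and a `date` field and that a record counts as present when that pair matches. The IDs and values are placeholders.

```python
def find_missing_entries(stat_records, wikidata_records):
    """Return the indices of stat_records that have no match in wikidata_records.

    Two records match when their (wikidata_id, date) pairs are equal.
    """
    known = {(r["wikidata_id"], r["date"]) for r in wikidata_records}
    return [
        index
        for index, record in enumerate(stat_records)
        if (record["wikidata_id"], record["date"]) not in known
    ]


# Placeholder records; "Q0000001" is not a real Wikidata ID.
stat_records = [
    {"wikidata_id": "Q0000001", "date": "2017-12-31", "population": 100},
    {"wikidata_id": "Q0000001", "date": "2018-12-31", "population": 110},
]
wikidata_records = [
    {"wikidata_id": "Q0000001", "date": "2017-12-31", "population": 100},
]

missing = find_missing_entries(stat_records, wikidata_records)
print(missing)
```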

insert_wikidata.py

Function: create_reference Description: :return:

Function: create_qualifier Description: :return:

Function: insert_wikidate Description: :return:
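Since the functions above are not documented yet, the following sketch shows what a population statement with a date qualifier and a reference URL looks like in the Wikibase JSON format. It is not the project's implementation and does not call any bot library. The function names and the population value are hypothetical, and the preferred rank is an assumption based on the identifier list in this README. P459 (determination method) is left out for brevity.

```python
import json


def build_time_value(iso_date: str) -> dict:
    """Return a Wikibase time value with day precision for a YYYY-MM-DD date."""
    return {
        "time": f"+{iso_date}T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": 11,  # 11 = day
        "calendarmodel": "http://www.wikidata.org/entity/Q1985727",
    }


def snak(prop: str, value, value_type: str) -> dict:
    """Return a value snak for the given property."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"value": value, "type": value_type},
    }


def build_population_claim(population: int, iso_date: str, reference_url: str,
                           rank: str = "preferred") -> dict:
    """Return a P1082 statement with a P585 qualifier and a P854 reference."""
    return {
        "type": "statement",
        "rank": rank,
        "mainsnak": snak("P1082", {"amount": f"+{population}", "unit": "1"}, "quantity"),
        "qualifiers": {"P585": [snak("P585", build_time_value(iso_date), "time")]},
        "references": [{"snaks": {"P854": [snak("P854", reference_url, "string")]}}],
    }


claim = build_population_claim(
    1234,  # placeholder value, not a real population figure
    "2018-12-31",
    "https://data.stadt-zuerich.ch/dataset/bev_bestand_jahr_quartier_seit1970_od3240",
)
print(json.dumps(claim, indent=2))
```

Building the statement as plain data keeps it inspectable before anything is written to Wikidata, which matters given the limitation that the current code deletes existing population data.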

ProjectUpdate.ipynb

WikiBot.ipynb
