
cosmogony's Introduction

Cosmogony


This is home to Cosmogony, a project that aims at providing an efficient tool to quickly use and update worldwide geographical regions. It returns geographical zones with a structured hierarchy to easily know that Paris is a city in the state Île-de-France in the country France. The architecture of Cosmogony is based on OpenStreetMap data and on the exploitation of well defined libpostal rules to type each zone according to its country. Then the resulting hierarchy is built thanks to geographical inclusion. An example of a full data extract can be browsed at http://cosmogony.world.

To explore and navigate fluently in the built hierarchy, Cosmogony comes along with two other tools:

Below is a brief visualisation of a basic use case of the Cosmogony Explorer:

Getting started

Get data

🚧 Until we propose a direct data download, you have to extract your geographic regions by yourself (see below). 🚧

Use data

The best way to explore the data (i.e. the coverage, the zones metadata, the hierarchy...) is our Cosmogony Explorer

🚧 In the future, we may create other tools to use the data. Please share your ideas and needs in the issues. 🚧

Extract data

You can build cosmogony to extract the regions on your own.

  • Build

Here are the necessary manual steps to build cosmogony :

curl https://sh.rustup.rs -sSf | sh    # install rust
apt-get install libgeos-dev            # install GEOS
git clone https://github.com/osm-without-borders/cosmogony.git     # Clone this repo
cd cosmogony;                          # enter the directory
git submodule update --init            # update the git submodules
cargo build --release                  # finally build cosmogony
  • Run

You can now grab some OSM pbf and extract your geographic zones: cargo run --release -- generate -i /path/to/your/file.osm.pbf

Check out cosmogony help for more options: cargo run --release -- -h

  • Other subcommands

Note: the default subcommand is the generate subcommand, so cosmogony -i <osm-file> -o <output-file> is the same as cosmogony generate -i <osm-file> -o <output-file>

  • Merging cosmogonies

To generate a world cosmogony on a server without a lot of RAM, you can generate cosmogonies from split, non-overlapping OSM files without a shared parent (e.g. split by continent or country) and merge the generated cosmogonies.

To merge several cosmogonies into one you can use the custom subcommand merge: cargo run --release -- merge *.jsonl -o merged_cosmo.jsonl

Note: to reduce the memory footprint, it can only merge json lines cosmogonies (so .jsonl or .jsonl.gz).
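Reading a JSON Lines cosmogony can also be done one line at a time, which is what keeps the memory footprint low. The sketch below assumes that each non-empty line of the file holds one JSON object; the source does not describe the exact line layout, and the helper name iter_zones is hypothetical.

```python
import gzip
import json
from typing import Iterator


def iter_zones(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a .jsonl or .jsonl.gz file.

    Assumption: one JSON object per line (not confirmed by the README).
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Because the function is a generator, a caller can filter or count objects without ever holding the whole file in memory.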

Documentation

The initial purpose of Cosmogony is to enhance mimir, our geocoder (See the founding issue for a bit of context). Another common use case is to create geospatially aware statistics, such as choropleth maps. Anyway, we'd love to know what you've built from this, so feel free to add your use cases in Awesome Cosmogony.

Data sources and algorithm

OpenStreetMap (OSM) seems the best datasource for our use case. However the OSM administrative regions (admins) have several drawbacks:

  • admin_level: The world is a complex place where each country has its own administrative division. OSM uses an admin_level tag, with values ranging from 1 to ~10 to allow consistent rendering of the borders among countries. This is fine for making maps, but if you want a world list of cities or regions, you still need local and specific knowledge to find which admin_level to use in each country.
  • no existing hierarchy: indeed the OSM data model rests only on nodes, ways and relations, without any structure.

To mitigate this, the general idea is to take an OSM pbf file and to:

  • use a geometric algorithm to define which admin belongs to another admin (we'll start with shapes exact inclusion and see if that's enough).
  • use the libpostal rules to type the admin depending on its country.
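The inclusion step can be illustrated with a deliberately simplified sketch. It swaps the exact shape inclusion mentioned above for bounding-box inclusion, so it is not what cosmogony does; it only shows the idea of picking, for each zone, the smallest zone that contains it. The function names and the rough coordinates in the usage lines are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (min_lon, min_lat, max_lon, max_lat)
BBox = Tuple[float, float, float, float]


def contains(outer: BBox, inner: BBox) -> bool:
    """True when `inner` lies entirely within `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def area(box: BBox) -> float:
    """Planar area of a box in square degrees (good enough for ranking)."""
    return (box[2] - box[0]) * (box[3] - box[1])


def assign_parents(boxes: Dict[str, BBox]) -> Dict[str, Optional[str]]:
    """Map each zone to the smallest strictly larger zone that contains it."""
    parents: Dict[str, Optional[str]] = {}
    for name, box in boxes.items():
        candidates = [
            other for other, other_box in boxes.items()
            if other != name
            and contains(other_box, box)
            and area(other_box) > area(box)
        ]
        parents[name] = min(candidates, key=lambda o: area(boxes[o]), default=None)
    return parents


# Rough, illustrative boxes only.
boxes = {
    "France": (-5.0, 41.0, 10.0, 51.5),
    "Île-de-France": (1.4, 48.1, 3.6, 49.3),
    "Paris": (2.22, 48.81, 2.47, 48.91),
}
print(assign_parents(boxes))
```

Bounding boxes over-approximate real shapes, so a production version would need a true polygon containment test, as the bullet above states.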

OSM administrative regions may not be mapped with the same precision all over the earth but the data is easy to update and the update will benefit the community.

Beyond OSM, we may consider using other data sources in the future (with a compliant license). However, we don't want cosmogony to become too complex, as the great WhosOnFirst is (see below).

Administrative types

The libpostal types seem nice (and made by brighter people than us):

  • suburb: usually an unofficial neighborhood name like "Harlem", "South Bronx", or "Crown Heights"
  • city_district: these are usually boroughs or districts within a city that serve some official purpose e.g. "Brooklyn" or "Hackney" or "Bratislava IV"
  • city: any human settlement including cities, towns, villages, hamlets, localities, etc.
  • state_district: usually a second-level administrative division or county.
  • state: a first-level administrative division. Scotland, Northern Ireland, Wales, and England in the UK are mapped to "state" as well (convention used in OSM, GeoPlanet, etc.)
  • country_region: informal subdivision of a country without any political status
  • country: sovereign nations and their dependent territories, anything with an ISO-3166 code.
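Typing a zone then amounts to a per-country lookup from admin_level to one of the types above. The sketch below is a hypothetical rule table, not the libpostal rule format: the "US" entry is only consistent with the Alabama statistics in the output example of this README (levels 6 and 8, counted as StateDistrict and City), and the country key format is an assumption.

```python
from typing import Dict, Optional

# Hypothetical rule table: country code -> admin_level -> zone type.
RULES: Dict[str, Dict[int, str]] = {
    "US": {6: "state_district", 8: "city"},
}


def zone_type(country_code: str, admin_level: int) -> Optional[str]:
    """Return the zone type for a country and admin_level, or None if no rule applies."""
    return RULES.get(country_code, {}).get(admin_level)
```

Returning None for a missing rule keeps unknown countries and unhandled levels visible to the caller instead of silently guessing a type.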

Names and Labels

Cosmogony reads OSM tags to determine names and labels for all zones, in all available languages.

In addition to name:* tags from boundary objects themselves, other names from related objects are used as they may provide more languages :

  • nodes with role label (if present)
  • nodes with role admin_center (if relevant: for cities, or on matching wikidata ID)

Note that these additional name:* values are included in zone tags in the output to help reusing, even if they are not part of the OSM object tags.
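A minimal sketch of how such names could be combined is shown below. The priority order is an assumption: tags from the boundary object win, and names from the label and admin_center nodes only fill in missing languages. The function name collect_names is hypothetical.

```python
from typing import Dict, Optional

Tags = Dict[str, str]


def collect_names(boundary: Tags,
                  label: Optional[Tags] = None,
                  admin_center: Optional[Tags] = None) -> Tags:
    """Merge name:* tags from a boundary and its related nodes.

    Sources are applied from lowest to highest priority, so a later
    source overwrites an earlier one for the same language.
    """
    names: Tags = {}
    for tags in (admin_center, label, boundary):
        for key, value in (tags or {}).items():
            if key.startswith("name:"):
                names[key] = value
    return names
```

Only keys starting with name: are copied, so the plain name tag and unrelated tags of the related nodes are left out.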

Output schema

Below is a brief example of the information contained in the cosmogony output.

{
	"zones":[
		{"id":0,
		"osm_id":"relation:110114",
		"admin_level":8,
		"zone_type":"city",
		"name":"Sand Rock",
		"zip_codes":[],
		"center":{"coordinates":[-85.77153961457083,34.2303942501858],"type":"Point"},
		"bbox": [-85.803571, 34.203915, -85.745058, 34.26666],
		"geometry":{
			"coordinates":"..."
		},
		"tags":{
			"admin_level":"8",
			"border_type":"city",
			"boundary":"administrative",
			"is_in":"USA"
		},
		"parent":"null",
		"wikidata":"Q79669"}
	],
		"meta":{
			"osm_filename":"alabama.osm.pbf",
			"stats":{"level_counts":{"6":64,"8":272},
			"zone_type_counts":{"City":272,"StateDistrict":64},
			"wikidata_counts":{"6":58,"8":202},
			"zone_with_unkwown_country_rules":{},
			"unhandled_admin_level":{},
			"zone_without_country":0}
		}
}
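To get from a zone to its full chain of parents (for instance Paris, then Île-de-France, then France), the parent field can be followed upwards. This sketch assumes that parent holds the id of the parent zone and that a root zone has either a JSON null or the string "null" shown in the example above; the function name ancestors is hypothetical.

```python
from typing import Dict, Hashable, List


def ancestors(zones: List[dict], zone_id: Hashable) -> List[str]:
    """Return the names from the given zone up to its root, following `parent`."""
    by_id: Dict[Hashable, dict] = {zone["id"]: zone for zone in zones}
    chain: List[str] = []
    seen = set()
    current = by_id.get(zone_id)
    while current is not None and current["id"] not in seen:
        seen.add(current["id"])  # guards against a cycle in the data
        chain.append(current["name"])
        parent = current.get("parent")
        current = None if parent in (None, "null") else by_id.get(parent)
    return chain
```

With a loaded cosmogony this would be called as ancestors(data["zones"], some_id), where data is the parsed output file.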

Dataset quality test

You can check the cosmogony file built with our Cosmogony Data Dashboard.

🚧 Ideas and other contributions welcomed in issue #4 🚧

Contribute

Cosmogony, just like OpenStreetMap, emphasizes local knowledge: even if you can't code, you can help us to make Cosmogony go worldwide 🚀

If the cosmogony of your country does not look good, here is what you can do to fix it:

Tell us which administrative zones are relevant and how to extract them from OSM

Tell us how many administrative zones are expected

See also

deprecated, and without cascading hierarchy

Our main inspiration source 💖 Hard to maintain because of the many sources involved that need deduplication and concordances, difficult to ensure a coherent hierarchy (an object Foo can have an object Bar as a child whereas Foo is not listed as a parent of Bar), etc.

Pretty cool if you just need to inspect the coverage or export a few administrative areas. Still need country specific knowledge to use worldwide.

Without cascading hierarchy. Dunno if it's up to date, and how we can contribute.

Licenses

All code in this repository is under the Apache License 2.0.

This project uses OpenStreetMap data, licensed under the ODbL by the OpenStreetMap Foundation. You need to visibly credit OpenStreetMap and its contributors if you use or distribute the data from cosmogony. Read more on OpenStreetMap official website.

cosmogony's People

Contributors

amatissart, antoine-de, crocme10, dependabot[bot], guillaumegomez, jbgriesner, ncraley, nlehuby, remi-dupre, sdrll, severo, tristramg

cosmogony's Issues

Need help with mapping country regions for COVID tracking

Hi, I'm a contributor to coronadatascraper, an open source project aiming to scrape official websites all around the world for COVID numbers.

My problem is that I'm trying to find a system which we can use to define different hierarchies within countries. I created the country-levels project from Natural Earth dataset, but I'm not happy with it as for example it contains bad Admin 1 divisions in Spain.

Can you help us with your experience? My aim is to make a short code based system, like hasc:ES.CL or something similar which we can use to refer to a region. We'd need to have a GeoJSON for each region + a Wikidata link for population fetching. Is this possible somehow with your project?

Here is a relevant issue which I've opened, if you could contribute to the discussion it'd be great!
https://github.com/lazd/coronadatascraper/issues/286

cosmogony parameters

Hi,

Years ago, I used cosmogony with the parameters filter-langs and country-code. They are no longer recognized by the latest version, and I have not found any relevant information on the matter.

cosmogony --filter-langs fr en --country-code FR -i FR/FR.osm.pbf -o FR/FR.2.osm.json

Could someone be nice enough to help on that matter ?

Cheers

Philippe

Unable to build cosmogony as described into README.md + Pull Request


hznor@local:~/mimirsbrunn/cosmogony$ cargo build --release

[...]

  Compiling termcolor v1.1.3
   Compiling heck v0.4.0
   Compiling geo-types v0.7.7
   Compiling flat_map v0.0.10
error: expected `,`, found `.`
   --> /home/hznor/.cargo/registry/src/github.com-1ecc6299db9ec823/geo-types-0.7.7/src/geometry/geometry_collection.rs:111:25
    |
111 | #[deprecated(since = 0.7.5, note = "Use `GeometryCollection::from(vec![geom])` instead.")]
    |                         ^ expected `,`

   Compiling smartstring v0.2.10
   Compiling clap_derive v4.0.13
   Compiling clap_lex v0.3.0
error: could not compile `geo-types` due to previous error
warning: build failed, waiting for other jobs to finish...

This typo has since been fixed in the repo: georust/geo.
Incriminated line 111:

Now corrected by:

The solution seems to be to upgrade the geo-types dependency to version 0.7.8.

I have sent a Pull Request to address this issue.

Bug : Canada is not recognized correctly

Hi,

First of all, thanks for this great piece of software.

I may have encountered a bug while working on some partial OSM data set.
I am using geofabrik.de which provides the OSM data per country
http://download.geofabrik.de/

Cosmogony is working well for France, Belgium and Germany.
But for Canada it is not working properly.
http://download.geofabrik.de/north-america/canada-latest.osm.pbf

The shape (the geometry if you prefer) of Canada is not computed correctly.
Displaying this shape shows only the Haïda Gwaïi archipelago and not the main land !
As a direct consequence the hierarchy is not computed correctly

For example, It computes
Mount Royal, Montreal (06), Quebec
But it should compute
Mount Royal, Montreal (06), Quebec, Canada

Any idea how to fix it ?

Best regards,

Approximate polygons for areas/levels where they are missing

The Problem

There are a lot of areas around the world where it is close to impossible to obtain a proper partition of a country with available polygon data for all levels (especially city and below).

On the other hand, OSM has an abundance of city nodes for example.

How we can solve this

A first order approximation of the polygons for levels where OSM has a good coverage of nodes could be done via a Voronoi Diagram (https://en.wikipedia.org/wiki/Voronoi_diagram).

In practice

For preliminary tests, Postgis offers the ST_VoronoiPolygons (https://postgis.net/docs/ST_VoronoiPolygons.html) function.

Example output applied to a set of city nodes:
screen shot 2018-01-30 at 10 39 17 pm

This method would need to be refined and robustified to adapt to existing admin levels of higher level and situations where data is partially available (and other scenarios that might emerge).
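The principle behind the Voronoi approximation can be sketched without PostGIS: a location belongs to the cell of its nearest seed node. The code below is only that nearest-seed rule, using planar distance on raw longitude and latitude (an assumption that ignores the curvature of the earth); it does not build polygons, and the seed names are hypothetical.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (lon, lat)


def nearest_seed(point: Point, seeds: Dict[str, Point]) -> str:
    """Return the name of the seed whose Voronoi cell contains `point`."""
    def squared_distance(name: str) -> float:
        seed = seeds[name]
        return (seed[0] - point[0]) ** 2 + (seed[1] - point[1]) ** 2

    return min(seeds, key=squared_distance)


# Hypothetical city nodes.
seeds = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
print(nearest_seed((2.0, 1.0), seeds))
```

Restricting the seeds to the nodes inside one parent zone would be a first step towards respecting the existing higher-level boundaries mentioned above.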

Quality Assurance

Some ideas to test the quality of our dataset:

Non closed boundaries
We need to log the list of the boundaries that could not be imported because they are not valid polygon / multipolygon

Hierarchy coherence

  • an object should never be bigger than its parent
  • child zones of a region should not overlap
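The first rule can be approximated from the output file alone. The sketch below uses the bbox field as a proxy for size, which is an assumption (a real check would compare polygon areas), and it assumes that parent holds the id of the parent zone. The function names are hypothetical.

```python
from typing import Hashable, List


def bbox_area(bbox: List[float]) -> float:
    """Planar area of [min_lon, min_lat, max_lon, max_lat] in square degrees."""
    return (bbox[2] - bbox[0]) * (bbox[3] - bbox[1])


def bigger_than_parent(zones: List[dict]) -> List[Hashable]:
    """Return the ids of zones whose bounding box is larger than their parent's."""
    by_id = {zone["id"]: zone for zone in zones}
    offenders = []
    for zone in zones:
        parent = by_id.get(zone.get("parent"))  # None for roots or unknown ids
        if parent is not None and bbox_area(zone["bbox"]) > bbox_area(parent["bbox"]):
            offenders.append(zone["id"])
    return offenders
```

An empty result does not prove the hierarchy is coherent; it only means that this coarse check found nothing to report.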

Coverage stat and tests
By country statistics:
Compute the geographical coverage in states, cities, etc. (example: 88% city coverage, which means that 88% of the country territory is inside a city)

Persist expected values and test them in CI:
for example:

  • Each country must have a 100% state coverage
  • France country must have at least 99% city coverage
  • etc

Volumetric stat and tests
Stat:
same as above, but only raw numbers, without geographical concerns (example: Australia country has 17 states)

Test:

  • Australia should have 17 states
  • Australia should have between 1600 and 1700 cities
  • etc

Expected values for each country must be in a config file (CSV, YAML?) and not inside the source code, so that anybody can update them if needed.
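A volumetric test of this kind could look like the sketch below. The expected ranges reuse the Australia figures quoted above; the in-code dictionary stands in for the external config file, and the structure and names (EXPECTED, check_counts) are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Stand-in for a config file: country -> zone type -> (min, max) expected count.
EXPECTED: Dict[str, Dict[str, Tuple[int, int]]] = {
    "Australia": {"state": (17, 17), "city": (1600, 1700)},
}


def check_counts(country: str,
                 zone_types: Iterable[str],
                 expected: Dict[str, Dict[str, Tuple[int, int]]] = EXPECTED) -> List[str]:
    """Compare the number of zones per type with the expected range; return failures."""
    counts = Counter(zone_types)
    failures = []
    for zone_type, (low, high) in expected.get(country, {}).items():
        found = counts.get(zone_type, 0)
        if not low <= found <= high:
            failures.append(
                f"{country}: expected {low}..{high} {zone_type} zones, found {found}"
            )
    return failures
```

In CI, a non-empty list of failures would fail the build and name the country and zone type to investigate.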

Postgis output

To make it easier to reuse, we need to add the ability to persist our cosmogony to a postgis database.

Tests failed in cosmogony run

Hi, I hope you're all doing well!

I have tried building and launching my own Cosmogony, because I have my own project of this kind (https://mappumappu.000webhostapp.com/index.html), albeit on a smaller scale, so it really piqued my interest!
But when I try to do a full flow of the cosmogony process
(1 - Fetch an ...osm.pbf from Geofabrik;
2 - Build a .json with the Cosmogony project;
3 - try to run Cosmogony Explorer with the .json.)
I get an error in the tests assertion part. I have tried with the Portugal-latest and Poland-latest osm.pbf files. It does, however, build the Explorer with the files already in the Cosmogony project(for example the luxembourg_filtered.osm.pbf in the cosmogony\tests\data folder), although it shows nothing besides the UI of the explorer(don't know if that's the expected behaviour). Do I need to add the .osm.pbf file to the ../tests/data so it knows what to expect?
I don't know what I can do to fix it so all help would be appreciated.

P.S.: There are two images attached, the first with the result of cosmogony json build and the second with more info of the error.
Also, the commands I'm running are:

cargo run --release -- generate -i portugal.osm.pbf

PATH_TO_COSMOGONY_DIR=../cosmogony_data pipenv run inv -e run-local --cosmogony-file-name=cosmogony_Portugal.json --build-dockers

(cosmogony build)
result_build_cosmogony

(error in Docker run)
error_run_json_cosmogony

industrial and retail areas

There are areas that are marked as landuse=industrial or landuse=retail, and it would be useful to classify POIs located inside one of them as belonging to that area. For example, there's a retail area that is not inside any city: on openstreetmap.org I can find a McDonald's inside it by searching for McDonald's Gan HaZafon, but I can't do that in mimir.

Bruxelles region not shown

When I browse to http://cosmogony.world/#/7.59/4.26/50.80/168112 and I switch to "Country region", I get an empty region for Bruxelles (normally, it's admin_level=4) but I see the two other regions Flanders and Wallonie (both admin_level=4)
See 1st screenshot
selection_905
Then, I choose "State". I see Bruxelles area not empty and when I click on "Bruxelles", I get the polygon with admin_level=4 I was expecting one level above (at "Country region"). See 2nd screenshot
selection_906
It doesn't seem normal. Can you confirm it's an issue?

Incorrect classification for Israel

Israel has a complicated way of defining its administrative borders, so I want to ask how to handle it before opening a PR.

Israel as a country has the following hierarchy:

  • Country (Israel)
  • Districts (Northern District, Southern District, etc), they are shown as states in cosmogony.
  • Council (City council, local council and regional council), they are mapped as cities.
  • Villages which are only under a regional council

So basically admin_level=8 can mean either a city council/local council which are actually a city but they can also mean a regional council which is a collection of villages or kibbutzim.

Sometimes the council will be marked as place=city/town and sometimes inside the council area there would be another area that is tagged as place=city/town/village.

Maybe using the place tag would be better for cities and the council would be for state district or something like that?

Merge multiple non overlapping cosmogony files

To generate a cosmogony of the world, you currently need a lot of ram since the whole planet.osm.pbf needs to be loaded.

We could read split non overlapping osm files, without a shared parent (eg. split by continent or country) and merge the generated cosmogony.

Since there is no contract on the indexes (id and parent) being consecutive, it seems that a dumb per-file offset, so that each file has its ids in a separate range, will do the job.
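The per-file offset can be sketched as follows. It assumes integer id and parent values, with None for a zone without a parent, which the issue does not specify; the function name merge_with_offsets is hypothetical.

```python
from typing import List


def merge_with_offsets(files: List[List[dict]]) -> List[dict]:
    """Concatenate zone lists, shifting ids so each file gets its own id range."""
    merged: List[dict] = []
    offset = 0
    for zones in files:
        if not zones:
            continue
        for zone in zones:
            shifted = dict(zone)  # copy, so the input is left untouched
            shifted["id"] = zone["id"] + offset
            if zone.get("parent") is not None:
                shifted["parent"] = zone["parent"] + offset
            merged.append(shifted)
        # The next file starts right after the highest id used so far.
        offset += max(zone["id"] for zone in zones) + 1
    return merged
```

This only works because the inputs share no parent: a parent reference never has to point into another file.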

I have a first shot at this, and need to test it on real cases but do you have any thoughts on this (I haven't worked on cosmogony for a while, maybe I'm completely off the mark)?

ping @prhod @amatissart

Transfer this repository to another organisation?

This repository has been created in the QwantResearch organization by convenience, should we move it to another organization ?

And if we do it, should we create a Cosmogony organization or should we use an existing one ?

Segmentation fault

I'm getting a segfault when executing on the current weekly OSM planet.pbf (i.e. exported 2021-06-26) using the github master.

No problems when executing on a limited dataset (sweden.pbf from GeoFabrik).

Any suggestions?

I'll try again as soon as the next weekly planet.pbf is released.

[2021-06-30T20:26:00Z INFO  cosmogony_builder::additional_zones] Ignoring place with id node:4295699280 and country relation:2202162 as parent
[2021-06-30T20:26:00Z INFO  cosmogony_builder::additional_zones] Ignoring place with id node:4295699281 and country relation:2202162 as parent
[2021-06-30T20:26:00Z INFO  cosmogony_builder::additional_zones] Ignoring place with id node:4318991429 and country relation:2103120 as parent
[2021-06-30T20:26:03Z INFO  cosmogony_builder::additional_zones] We'll compute voronois partitions for 14005 parent zones
intersection failure: impossible to build a geometry from a nullptr in "Geometry::intersection
Last error: TopologyException: Input geom 0 is invalid: Self-intersection at or near point 11.341666499999999 8.7999999999999989 at 11.341666499999999 8.799999999999998" (Unknown GEOS error)
intersection failure: impossible to build a geometry from a nullptr in "Geometry::intersection
Last error: TopologyException: Input geom 0 is invalid: Self-intersection at or near point 11.341666499999999 8.7999999999999989 at 11.341666499999999 8.799999999999998" (Unknown GEOS error)
intersection failure: impossible to build a geometry from a nullptr in "Geometry::intersection
Last error: TopologyException: Input geom 0 is invalid: Self-intersection at or near point 11.550000000000001 8.3166669999999989 at 11.550000000000001 8.316666999999998" (Unknown GEOS error)

...

extrude_existing_town: difference failed for node:150968654: NoConstructionFromNullPtr("Geometry::difference\nLast error: TopologyException: Input geom 1 is invalid: Nested shells at or near point -121.29731 38.243600000000001 at -121.29731 38.24360000000000")
extrude_existing_town: difference failed for node:29941752: NoConstructionFromNullPtr("Geometry::difference\nLast error: TopologyException: Input geom 1 is invalid: Nested shells at or near point -89.753559199999998 43.144259999999996 at -89.753559199999998 43.14425999999999")
extrude_existing_town: difference failed for node:1178614100: NoConstructionFromNullPtr("Geometry::difference\nLast error: TopologyException: Input geom 1 is invalid: Self-intersection at or near point 133.2864505703993 44.099531000885996 at 133.2864505703993 44.09953100088599")
extrude_existing_town: difference failed for node:1346990699: NoConstructionFromNullPtr("Geometry::difference\nLast error: TopologyException: Input geom 1 is invalid: Self-intersection at or near point 133.2864505703993 44.099531000885996 at 133.2864505703993 44.09953100088599")
extrude_existing_town: difference failed for node:1080033707: NoConstructionFromNullPtr("Geometry::difference\nLast error: TopologyException: Input geom 1 is invalid: Self-intersection at or near point 133.2864505703993 44.099531000885996 at 133.2864505703993 44.09953100088599")

...

Last error: TopologyException: Input geom 0 is invalid: Self-intersection at or near point 4.0499999999999998 7.8749999999999991 at 4.0499999999999998 7.874999999999999" (Unknown GEOS error)
intersection failure: impossible to build a geometry from a nullptr in "Geometry::intersection
Last error: TopologyException: Input geom 0 is invalid: Self-intersection at or near point 4.0499999999999998 7.8749999999999991 at 4.0499999999999998 7.874999999999999" (Unknown GEOS error)
intersection failure: impossible to build a geometry from a nullptr in "Geometry::intersection
Last error: TopologyException: Input geom 0 is invalid: Self-intersection at or near point 3.7916669999999995 7.9249999999999998 at 3.7916669999999995 7.924999999999999" (Unknown GEOS error)
Failed to compute voronoi for parent relation:3720587: impossible to build a geometry from a nullptr in "Geometry::voronoi
Last error: TopologyException: Input geom 1 is invalid: Self-intersection at or near point 4.3749999999999991 7.9499999999999993 at 4.3749999999999991 7.949999999999999"
extrude_existing_town: difference failed for node:150963743: NoConstructionFromNullPtr("Geometry::difference\nLast error: TopologyException: Input geom 1 is invalid: Nested shells at or near point -122.78478179999999 45.340228699999997 at -122.78478179999999 45.34022869999999")
extrude_existing_town: difference failed for node:4571365408: NoConstructionFromNullPtr("Geometry::difference\nLast error: TopologyException: Input geom 1 is invalid: Self-intersection at or near point 107.95787713017623 26.570585016953814 at 107.95787713017623 26.57058501695381")
extrude_existing_town: difference failed for node:4571483306: NoConstructionFromNullPtr("Geometry::difference\nLast error: TopologyException: Input geom 1 is invalid: Self-intersection at or near point 107.95787713017623 26.570585016953814 at 107.95787713017623 26.57058501695381")
Segmentation fault (core dumped)

Answer the question: where am I?

When looking at a map, it would be nice to know what I am looking at.

I see two possible approaches for now:

By viewport

The area that covers the largest % of the screen (and less than 90%, otherwise it would always display « Europe »).

This requires that one displays a map, which is not always the case.

A geocoder that allows to search a restaurant needs to know where the restaurant is.

The nice thing is that it would show Central Park instead of Manhattan.
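The viewport rule can be sketched with bounding boxes: compute the share of the viewport that each zone covers, drop the zones at or above the 90% threshold, and keep the largest remaining one. Using boxes instead of real shapes is an assumption, and the names and coordinates below are hypothetical.

```python
from typing import Dict, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def overlap_share(view: BBox, box: BBox) -> float:
    """Fraction of the viewport area covered by `box`."""
    width = min(view[2], box[2]) - max(view[0], box[0])
    height = min(view[3], box[3]) - max(view[1], box[1])
    if width <= 0 or height <= 0:
        return 0.0
    return (width * height) / ((view[2] - view[0]) * (view[3] - view[1]))


def best_zone(view: BBox, boxes: Dict[str, BBox], max_share: float = 0.9) -> Optional[str]:
    """Zone covering the largest share of the viewport, below `max_share`."""
    shares = {name: overlap_share(view, box) for name, box in boxes.items()}
    eligible = {name: share for name, share in shares.items() if 0 < share < max_share}
    return max(eligible, key=eligible.get, default=None)


view = (0.0, 0.0, 10.0, 10.0)
boxes = {
    "Manhattan": (-5.0, -5.0, 15.0, 15.0),   # fills the whole viewport
    "Central Park": (1.0, 1.0, 9.0, 9.0),    # covers 64% of it
}
print(best_zone(view, boxes))
```

In this example the zone that fills the whole screen is discarded by the threshold, which is the behaviour described above for Central Park and Manhattan.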

By a tag on the administrative level

It could be computed from the size instead of the administrative level. This would allow to handle differently Tokyo (2100km²) and Paris (100km²).

This tag could be attached to the profile of whoever is asking.

Duplicates finder (tool suggestion)

Duplicates finder

Given a .pbf, an administrative level, we want to find any duplicate.

For instance any way having the same name many kilometers away within the same city. This would help to know a focus should be given on a specific area to avoid any ambiguity for a geocoder
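A minimal sketch of such a finder is given below. It assumes the ways of one city have already been extracted as (name, (lon, lat)) pairs, which is not something the issue specifies, and it flags names that occur at least min_km apart using the haversine distance. The 2 km default and the function names are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]  # (lon, lat) in degrees


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance in kilometres between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def distant_duplicates(ways: Iterable[Tuple[str, Point]], min_km: float = 2.0) -> List[str]:
    """Names that appear at two positions at least `min_km` apart."""
    by_name: Dict[str, List[Point]] = {}
    for name, position in ways:
        by_name.setdefault(name, []).append(position)
    return sorted(
        name for name, points in by_name.items()
        if any(haversine_km(p, q) >= min_km for p, q in combinations(points, 2))
    )
```

The distance threshold keeps the segments of a single long street from being reported as duplicates of each other, at the cost of missing duplicates that are close together.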

Generating .json from planet.osm.pbf results in process being 'killed'.

When I try to generate a .json file based on either unfiltered or filtered by boundaries/admin_level planet.osm.pbf the process gets 'killed' after a while. I'll leave more info below and attached in the image.

I'm using WSL2 and the steps I'm doing are just:
1 - Get planet.osm.pbf file(in my case I got it from the torrent version);
2 - Filter pbf file by either boundary or admin_level, which would result in a considerably smaller file. Additionally I'm also filtering some other info out. The commands I'm using are these:
osmium tags-filter planet-latest.osm.pbf r/admin_level --overwrite -o planet-admin.osm.pbf
osmconvert planet-admin.osm.pbf -o=planet-admin.osm
osmfilter planet-admin.osm --drop-tags="barrier= building= highway= landuse= office= place= waterway=" -o=planet-admin-noplace.osm
osmconvert planet-admin-noplace.osm -o=planet-admin-noplace.osm.pbf;

3 - Build cosmogony with command:
cargo run --release -- generate -i ../../planet-admin-noplace.osm.pbf
It's in this command that the process is getting "killed".

error_gen_json_from_planetosmpbf

(Note: This happens with unfiltered planet.osm.pbf as well. I also don't know if this could be related to #118 ...)

Explore the cosmogony

We will need a visualization tool (or maybe several tools ?) for our day to day usage of cosmogony.
The purpose of this issue is to gather our needs. Then we may summarize it in the readme of this repo.

Visual coverage by zone type
we want to explore the world on a map, select a zone type and see the existing zones of this type on the map to get an idea of the coverage
a POC has been done in this repo, PR : osm-without-borders/cosmogony_explorer#1

View zone metadata
we want to select a zone and get all its metadata (names in different languages, wikidata id, etc)

All zones containing a point
we want to click the map and see all the zones including the point

Explore the hierarchy
we want to select a zone, and get an idea of its hierarchy :

  • see all its parent zones
  • see all its direct child zones
  • see all its child zones, cascading the hierarchy
  • see its other linked zones

Download some zones
we want to select some zones (selecting from the map and/or using the hierarchy) and download them in a GIS friendly format (at least geojson, with metadata as properties).

Quality assurance
We want some dashboard with the coverage tests results described in issue #4

India and Bangladesh are missing when running on planet.pbf

I think this may be an issue in https://github.com/Qwant/osm_boundaries_utils_rs but I'm reporting here to bring it to your attention, I'm happy to close it if you think it should only exist there.

I realized that ZoneType=Country zones were missing for India and Bangladesh. In fact, it seems much more is missing, possibly because the larger country boundary is missing?

I'm reporting this not as a matter of pedantic precision but with the intention to alert you that it seems this entire area (India/Bangladesh) is missing.

image

Note however that the flanking islands are present:

image

For Bangladesh, the only part that is present is this little chunk (osm id: relation:3921211, specifically this chunk, name: রংপুর বিভাগ). Interestingly, I think it is precisely one of those enclaves (not sure)

image

I think it may be related to many enclaves being present between India and Bangladesh. See this wikipedia article, in fact there are even counter-counter enclaves, see this diagram on wikipedia:

image

The interesting thing is that southern Netherlands at the border with Belgium there are some enclaves as well but those seem to be handled fine (Yellow is the Netherlands and green is Belgium). Maybe it has to do with there being counter and counter-counter enclaves in India/Bangladesh? Wikipedia seems to claim they both create the only counter-counter enclave in the world.

image

I think this is just not supported right now?

https://github.com/Qwant/osm_boundaries_utils_rs/blob/master/src/boundaries.rs#L143-L144

I did notice that for example has a role of subarea within Bangladesh, and subarea is not in the list of roles to extract here, might it be as simple as adding it?

Ontology starting point

Here's the issue to start the discussion about schema of our zones "hierarchy".

The aim of this issue is to fill the concerned section in the README

here are my non structured thoughts:

categories

I like libpostal categories, libpostal is quite a reference in the address parsing world and we can hope their categories can handle all the countries specificities all around the world, but I don't think it handles all the corner cases (and it's not the only category out there, for example Wof uses another).

libpostal does not handle non administrative regions apart from the suburb (and maybe the country_region). So it would be difficult to represent Marne-la-Vallée or parc du mercantour

There is also the question of postal codes. I don't know whether we could/should have postal code zones in the hierarchy (should we create a separate issue for this?)

Pyramidal hierarchy or graph-based ?

Can a zone have at most one parent, or can it have several?

I feel that it might be a failing of Wof to have a pyramidal hierarchy. I don't think it will complicate cosmogony that much to be able to have several parents.
I don't think it's useful for purely administrative regions (but maybe there are countries where it's relevant), but for non-administrative regions I think a pyramidal hierarchy will be too restrictive.

E.g. what would we link Marne-la-Vallée to? Île-de-France? But then it would be difficult to link it back to the cities that are part of it.
The same applies to unofficial suburbs that can span several districts.

links coherence

Wof hierarchy is nice, but being linked to all parents brings incoherence (like france empire that contains france country, although the empire has fewer descendants than the country).
I feel like outputting only the first level of relationship forces the dataset to be coherent (even if it will make the dataset harder to use without tools).
