simonpoole / mapsplit
A fast way to split OSM data into a portable tiled format
License: Creative Commons Zero v1.0 Universal
I am not sure I am following the docs well enough (they are not that clear) but I downloaded a map from here:
https://download.geofabrik.de/south-america/chile.html
and then I ran:
java -Xmx6G -jar mapsplit-all-0.4.0.jar -tvMm -i chile-latest.osm.pbf -f 2000 -o chile1.msf -z 15 -O 2000 -s 200000000,20000000,2000000 -p chile.poly
But when I try to load the resulting file in Vespucci, I get an error "Not a MBTiles file format". Is there further modification needed to this command? What am I doing wrong?
When I try to split the Romania PBF file from Geofabrik, I get an out-of-memory error:
G:\Maperitive-latest\Programe\mapsplit>java -jar mapsplit.jar romania-latest.osm.pbf romania
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at HeapMap.<init>(HeapMap.java:56)
    at MapSplit.<init>(MapSplit.java:116)
    at MapSplit.run(MapSplit.java:846)
    at MapSplit.main(MapSplit.java:1053)
    ... 5 more
How can I solve it?
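The trace shows the JVM running out of heap, and the command above does not pass a heap size, so the JVM falls back to a default limit that depends on the machine. The following is a minimal sketch for checking which limit is in effect; `HeapCheck` and `maxHeapMiB` are hypothetical names for illustration and are not part of Mapsplit.

```java
// Hypothetical helper, not part of Mapsplit: report the JVM's heap ceiling.
final class HeapCheck {
    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it with a larger limit, for example `java -Xmx6G HeapCheck`, should report a value close to 6144 MiB. The same `-Xmx` option can be placed before `-jar` when starting Mapsplit.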
It would be best to publish to GitHub releases.
Currently (February 2021) the Geofabrik Europe extract contains 2,710,803,032 nodes, which is more than 2^31 (2,147,483,648). Mapsplit consequently can't split it.
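To make the limit concrete: a count above 2^31 - 1 cannot be held in a signed 32-bit Java `int`, which is presumably where the restriction comes from. The sketch below only illustrates the arithmetic; the class and method names are hypothetical and this is not Mapsplit's code.

```java
// Illustration only: why a node count above 2^31 does not fit a Java int.
final class NodeCountLimit {
    // Node count quoted in the issue (Geofabrik Europe extract, February 2021).
    static final long EUROPE_NODES = 2_710_803_032L;

    /** True if the count can be represented as a signed 32-bit int. */
    static boolean fitsInInt(long count) {
        return count >= 0 && count <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println("fits in int: " + fitsInInt(EUROPE_NODES));
        // Narrowing to int silently wraps around to a negative value.
        System.out.println("(int) cast = " + (int) EUROPE_NODES);
    }
}
```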
@tordanik FYI: as the org @imintel that developed mbtiles4j seems to be dead, I intend to fork it properly and make it available on Maven Central (see simonpoole/mbtiles4j#1), unless you want to do so.
If this could be made generic enough it probably should actually be done in https://github.com/simonpoole/mbtiles4j
The current updating mechanism doesn't really work. It should be replaced by at least one of the following:
Currently optimization uses standard Collections in its data structures. This could be improved (avoiding auto-boxing/unboxing) by using suitable primitive variants, for example from https://fastutil.di.unimi.it/
Currently the tile encoding
6                   4                   3     2 2 2
3                   7                   1     6 4 3
XXXX XXXX XXXX XXXX YYYY YYYY YYYY YYYY 1uuu uNNE nnnn nnnn nnnn nnnn nnnn nnnn
X - tile number
Y - tile number
u - unused
1 - always set to 1. This ensures that the value can be distinguished from empty positions in an array.
N - bits indicating immediate "neighbours"
E - extended "neighbour" list used
n - bits for "short" neighbour index, in long list mode used as index
limits the number of elements that can have an extended tile list (that is, elements that need to be copied into more than just nearby tiles) to 2^24-1 (16'777'215). An easy fix that would likely suffice for a while would be simply to rearrange the encoding as
6                   4                   332  2
3                   7                   109  7
XXXX XXXX XXXX XXXX YYYY YYYY YYYY YYYY 1ENN nnnn nnnn nnnn nnnn nnnn nnnn nnnn
As the NN bits are not used when the extended lists are in use (that is, when E is set), this gives us an extra 6 bits without having to change the logic at all. This would allow indexing 2^30-1 (1'073'741'823) extended tile lists. @tordanik comments?
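The proposed layout can be sketched as plain bit packing on a `long`. This is only an illustration of the bit positions described above (marker at bit 31, E at bit 30, NN at bits 29..28, n at bits 27..0, with bits 29..0 read as the list index when E is set). All class, constant, and method names are hypothetical and do not reflect Mapsplit's actual code.

```java
// Sketch of the proposed 64-bit layout; names are hypothetical, not Mapsplit's API.
final class ProposedTileCode {
    static final long MARKER = 1L << 31;               // always set, so a used slot is never 0
    static final long EXTENDED = 1L << 30;             // E bit
    static final int NEIGHBOUR_SHIFT = 28;             // NN occupies bits 29..28
    static final long SHORT_MASK = (1L << 28) - 1;     // n bits 27..0
    static final long EXT_INDEX_MASK = (1L << 30) - 1; // bits 29..0 when E is set

    private static long tileBits(int x, int y) {
        return ((long) (x & 0xFFFF) << 48) | ((long) (y & 0xFFFF) << 32);
    }

    /** Encode a value that uses the NN bits and the short n bits (E clear). */
    static long encodeShort(int x, int y, int neighbours, int shortBits) {
        return tileBits(x, y) | MARKER
                | ((long) (neighbours & 0x3) << NEIGHBOUR_SHIFT)
                | (shortBits & SHORT_MASK);
    }

    /** Encode a value that refers to an extended tile list (E set). */
    static long encodeExtended(int x, int y, int listIndex) {
        if (listIndex < 0 || listIndex > EXT_INDEX_MASK) {
            throw new IllegalArgumentException("list index needs more than 30 bits");
        }
        return tileBits(x, y) | MARKER | EXTENDED | listIndex;
    }

    static int tileX(long v) { return (int) (v >>> 48); }
    static int tileY(long v) { return (int) ((v >>> 32) & 0xFFFF); }
    static boolean isExtended(long v) { return (v & EXTENDED) != 0; }
    static int neighbours(long v) { return (int) ((v >>> NEIGHBOUR_SHIFT) & 0x3); }
    static int shortBits(long v) { return (int) (v & SHORT_MASK); }
    static int extendedIndex(long v) { return (int) (v & EXT_INDEX_MASK); }

    public static void main(String[] args) {
        long v = encodeExtended(4308, 2856, (1 << 30) - 1);
        System.out.println(tileX(v) + "_" + tileY(v)
                + " extended=" + isExtended(v) + " index=" + extendedIndex(v));
    }
}
```

Reading `neighbours` only makes sense when `isExtended` is false, since in extended mode the same two bits are part of the list index.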
Some tiles contain ways without referenced nodes. This seems to be the case when they are part of a multipolygon relation, but do not have nodes inside the tile themselves.
The "completeRelations" parameter is not set (defaults to false), so I would expect the relations to be incomplete, not the ways. And I read the Wiki description this way too.
An example is tile 4308_2856. It contains the incomplete ways 29433456, 29433458 (inners of 62921) and 77004276 (inner of 1174277).
@tordanik there are a number of trivial code smells as a result of the PRs I just merged. It didn't make sense to fix them in flight and mess up any work you are currently doing, so I left them in. See https://sonarcloud.io/project/issues?id=mapsplit&resolved=false&sinceLeakPeriod=true&types=CODE_SMELL (I wouldn't fret about the code complexity ones; the limit is so low that it is often not reasonable to try to meet it).
Multipolygon relations and their members should be contained in tiles that are completely covered by the multipolygon area, even if they do not contain nodes from any member way.
Copy and rehash when fill factor is exceeded.
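As an illustration of what copy-and-rehash could look like, here is a minimal open-addressing set of primitive `long` values that doubles its table once a fill factor is exceeded. It is a sketch under assumptions: the class name, the 0.75 threshold, the use of 0 as the empty marker, and the hash function are all chosen for the example and are not taken from Mapsplit's HeapMap.

```java
// Minimal open-addressing set of non-zero longs that grows by copy-and-rehash.
// Hypothetical sketch; Mapsplit's HeapMap may work differently.
final class LongHashSet {
    private static final float FILL_FACTOR = 0.75f; // assumed threshold
    private long[] slots = new long[16];            // power-of-two size; 0 marks an empty slot
    private int size;

    /** Adds a non-zero value; returns false if it was already present. */
    boolean add(long value) {
        if (value == 0) throw new IllegalArgumentException("0 is reserved for empty slots");
        if (size + 1 > slots.length * FILL_FACTOR) rehash();
        if (!insert(slots, value)) return false;
        size++;
        return true;
    }

    boolean contains(long value) {
        int mask = slots.length - 1;
        for (int i = hash(value) & mask; slots[i] != 0; i = (i + 1) & mask) {
            if (slots[i] == value) return true;
        }
        return false;
    }

    int size() { return size; }
    int capacity() { return slots.length; }

    // Allocate a table twice as large and re-insert every stored value.
    private void rehash() {
        long[] bigger = new long[slots.length * 2];
        for (long v : slots) {
            if (v != 0) insert(bigger, v);
        }
        slots = bigger;
    }

    // Linear probing; the fill factor guarantees an empty slot exists.
    private static boolean insert(long[] table, long value) {
        int mask = table.length - 1;
        int i = hash(value) & mask;
        while (table[i] != 0) {
            if (table[i] == value) return false;
            i = (i + 1) & mask;
        }
        table[i] = value;
        return true;
    }

    private static int hash(long value) {
        return (int) ((value * 0x9E3779B97F4A7C15L) >>> 32);
    }

    public static void main(String[] args) {
        LongHashSet set = new LongHashSet();
        for (long v = 1; v <= 13; v++) set.add(v);
        System.out.println("size=" + set.size() + " capacity=" + set.capacity());
    }
}
```

Storing values in a `long[]` rather than a `Set<Long>` also avoids boxing, at the cost of reserving one value (here 0) as the empty marker.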
Running mapsplit on a 971MB file (pbf of The Netherlands) doesn't seem to finish:
$ ulimit -n
4096
$ mapsplit/mapsplit -v --fd-max=4096 `pwd`/netherlands-latest.osm.pbf `pwd`/out
No datefile given. Writing all available tiles.
Reading: ~/t/netherlands-latest.osm.pbf
Writing: ~/t/out
3000000 nodes processed
6000000 nodes processed
9000000 nodes processed
12000000 nodes processed
15000000 nodes processed
18000000 nodes processed
21000000 nodes processed
24000000 nodes processed
27000000 nodes processed
30000000 nodes processed
33000000 nodes processed
36000000 nodes processed
39000000 nodes processed
42000000 nodes processed
45000000 nodes processed
48000000 nodes processed
51000000 nodes processed
54000000 nodes processed
57000000 nodes processed
^C
Whatever I try, after 57M nodes, mapsplit seems to hang (I waited for 5+ hours).
Increasing max open files to 65536 didn't seem to make a difference.
Is there anything I can do to get this to work?
@PedaB was the intent of https://github.com/simonpoole/mapsplit/blob/master/src/main/java/dev/osm/mapsplit/MapSplit.java#L250 that it would include the way in question in all tiles that potentially intersect with it? Because, maybe I don't understand it, but it would seem as if it includes far too many tiles.