osm-tiler's Introduction

osm-tiler

osm-tiler is an .osm.pbf tiler for efficiently breaking OpenStreetMap planet files into smaller chunks. It is ideal for distributed processing of OpenStreetMap data, since it groups spatially related data into separate files that do not need to be loaded into memory at once.

Use

osm-tiler planet.osm.pbf -z 7

Build

osm-tiler uses mason (https://github.com/mason) to fetch dependencies, which ensures a uniform development and build environment.

make osm-tiler

Test

The following command will download necessary sample files, rebuild the binary, and run osm-tiler against an OSM extract.

make test

Download

Downloads a small metro extract for testing osm-tiler.

make chs.osm.pbf

Clean

Deletes the osm-tiler binary.

make clean

osm-tiler's People

Contributors

rclark, shwetha-mc

osm-tiler's Issues

geojson => json

  • remove -g and --geojson flags
  • add -j and --json flags
  • document new json format in readme
  • json writer for nodes using RapidJSON
  • json writer for ways using RapidJSON
  • json writer for relations using RapidJSON

cc @rclark

what is osm-split?

osm-split is intended to be a C++ tool using libosmium for efficiently splitting an OSM planet file into tiled chunks of either line-delimited geojson or osm.pbf files, similar to most metro-extract tools. This allows the single-threaded, memory-hungry operations required for processing the planet to be run only once, leaving mapreduce tools free to perform additional transformations in JavaScript.

osm-split will make a couple of subjective decisions about the data that any processor could live with. For example, only OSM objects that can be joined to a node's lng/lat will be output. In the case of turn restrictions, this means building an intermediate geometry from the from/via/to objects so that they can be tiled. Beyond these basic spatial operations necessary for indexing, osm-split will not perform any other filtering of tags, which allows a wide array of tile-reduce processors to use the data however they need to.

Why not use existing extract tools?

Most of the existing extract tools allow for arbitrary polygons to be used to define areas of interest. This presents memory and computation challenges that we can easily sidestep with our use case, which is restricted to tiles. Generating a unique list of tiles that an OSM object overlaps with is significantly cheaper than performing the same operation for an arbitrary shape like a city boundary. Also, having tight control over how fuzzy OSM objects like relations are handled should give us a bit more flexibility over general-purpose tools like minjur.

next steps:

cc @rclark @camilleanne

benchmark performance and minimum specs

  • How fast can we split a metro area?
  • How fast can we split a large-ish region?
  • How fast can we split the planet?
  • What are the minimum machine specs for each of the above?

The first pass will use leveldb, which should be limited only by disk space. If this method is not fast enough, we can use libosmium's in-memory node cache instead. The in-memory cache will likely make processing faster, but it requires a very large machine.

cc @rclark

tile indexing

An OSM object should be included in any tile extract if any of its nodes are contained in that tile. We can vendor the tilebelt implementation of this tile labeling operation.

This approach will be extremely fast (~10M nodes per second in JavaScript and possibly much more in C++), and provide "good enough" tile indexing. Where it will fail is when an edge crosses a tile without any of its nodes being in that tile (like at a corner). I think we can overlook this for the types of data analysis we are concerned with. If we find that we really don't want to make the tradeoff, we can implement tile-cover in C++, although that would be a bigger lift.
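As a rough illustration of the node labeling step, here is a minimal C++ sketch of the usual Web Mercator lng/lat to tile conversion, the same idea as tilebelt's point-to-tile. The Tile struct and the point_to_tile name are made up for this example and are not taken from osm-tiler's source.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical tile type for this sketch; not osm-tiler's own.
struct Tile {
    std::uint32_t x;
    std::uint32_t y;
    std::uint32_t z;
};

// Label a lng/lat with the Web Mercator (XYZ) tile that contains it.
inline Tile point_to_tile(double lon, double lat, std::uint32_t zoom) {
    const double pi = 3.14159265358979323846;
    const double n = std::ldexp(1.0, static_cast<int>(zoom));  // 2^zoom tiles per axis

    const double s = std::sin(lat * pi / 180.0);
    double x = n * (lon / 360.0 + 0.5);
    double y = n * (0.5 - 0.25 * std::log((1.0 + s) / (1.0 - s)) / pi);

    // Clamp so lon = 180 and latitudes beyond the Mercator limit stay in range.
    x = std::min(std::max(x, 0.0), n - 1.0);
    y = std::min(std::max(y, 0.0), n - 1.0);

    // Truncation is a floor here because both values are non-negative.
    return Tile{static_cast<std::uint32_t>(x), static_cast<std::uint32_t>(y), zoom};
}
```

An object's tile list is then just the set of tiles returned for each of its nodes, which is why an edge that clips a tile corner without a node inside it is missed.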

An open question is what we should do with giant geometries like coastlines that span large numbers of tiles. I would be in favor of logic that says something like "if an object touches >100 unique tiles, bail and do not include it". Coastlines and other giant objects do not fit the TileReduce scaling model well, and are not needed for any of the analysis we have done so far. If a particular processor does need these, it can handle these geometries on its own.
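The bail-out rule could look something like the sketch below. It assumes each node has already been labeled with a tile, and it uses a hypothetical TileId tuple and unique_tiles helper; the limit of 100 is just the number floated above, not a settled value.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical tile key for this sketch: (x, y, z).
using TileId = std::tuple<std::uint32_t, std::uint32_t, std::uint32_t>;

// Collect the distinct tiles touched by an object's nodes.
// Returns false and leaves `out` empty as soon as more than
// `max_tiles` distinct tiles have been seen.
inline bool unique_tiles(const std::vector<TileId>& node_tiles,
                         std::size_t max_tiles,
                         std::set<TileId>& out) {
    out.clear();
    for (const TileId& t : node_tiles) {
        out.insert(t);  // duplicates are ignored by the set
        if (out.size() > max_tiles) {
            out.clear();
            return false;  // giant object: skip it entirely
        }
    }
    return true;
}
```

Stopping at the first tile over the limit keeps the cost of a coastline-sized object bounded, since the rest of its nodes never need to be visited.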

cc @rclark

tests

The current "test" simply runs osm-tiler on a small metro extract, and passes as long as it does not crash. Let's add:

  • unit tests - tile conversion, directory creation, required flag exceptions, etc
  • integration tests - verify that data gets tiled where it needs to be in the appropriate

When these are in place, let's also enable tests on travis.

cc @springmeyer

document performance shortcuts

We are intentionally building this tool to favor speed over completeness. We are also intentionally building it to work for the specific use cases presented by mapreducable network analysis. This means that we can and will take shortcuts that a more general purpose tool might go to great lengths to avoid. We should document the tradeoffs explicitly to set expectations.

Off the bat, we will need to document:

  • tile index methodology and gaps
  • large object omissions
  • relation omissions (or the flip side: which do we support?)

output to geojson

Output to line-delimited geojson. The output would be structured like:

./output/{{X}}/{{Y}}/{{Z}}/nodes.json
./output/{{X}}/{{Y}}/{{Z}}/ways.json
./output/{{X}}/{{Y}}/{{Z}}/relations.json
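For illustration, a small C++ helper that builds these paths might look like this. The tile_path name and its parameters are hypothetical, and the sketch follows the X/Y/Z directory order exactly as written above.

```cpp
#include <cstdint>
#include <string>

// Build the per-tile output path, e.g. ./output/35/51/7/nodes.json.
// Directory order is X/Y/Z, matching the layout proposed in this issue.
inline std::string tile_path(const std::string& root,
                             std::uint32_t x, std::uint32_t y, std::uint32_t z,
                             const std::string& filename) {
    return root + "/" + std::to_string(x) + "/" + std::to_string(y) + "/" +
           std::to_string(z) + "/" + filename;
}
```

Creating the directories themselves is left out; the helper only formats the string.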

output to osm.pbf

Similar to the geojson output option, but this would output a complete osm.pbf tile for each tile. The data would be indexed using the same method as any other case, but the geometries constructed would be tossed out after the osm.pbf subset was created. We can lean on libosmium for writing and compressing this data. The structure would look like:

./output/{{X}}/{{Y}}/{{Z}}/data.osm.pbf

These extracts would take a bit more work to use in processors, but would allow for maximum flexibility using node-osmium.

arg parsing

This should cover all the parameters I am aware of:

Usage:
  osm-split [options] <file>

<file> : osm.pbf file to process
--zoom -z : zoom level of tiles
--pbf -p : output tiled osm.pbf files
--geojson -g : output tiled line-delimited geojson files
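To make the expected behavior concrete, here is a dependency-free C++ sketch of a parser for these options. The Options struct and parse_args function are invented for this example, and treating <file> and --zoom as required is an assumption. The project's link line includes -lboost_program_options, so the real parser presumably uses Boost rather than hand-rolled code like this.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical options holder for this sketch.
struct Options {
    std::string file;      // osm.pbf file to process
    int zoom = -1;         // zoom level of tiles; -1 means "not given"
    bool pbf = false;      // output tiled osm.pbf files
    bool geojson = false;  // output tiled line-delimited geojson files
};

// Parse arguments (without the program name). Throws std::invalid_argument
// on unknown options or when <file> / --zoom is missing (assumed required).
inline Options parse_args(const std::vector<std::string>& args) {
    Options opts;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& a = args[i];
        if (a == "-z" || a == "--zoom") {
            if (i + 1 >= args.size()) {
                throw std::invalid_argument("--zoom requires a value");
            }
            opts.zoom = std::stoi(args[++i]);
        } else if (a == "-p" || a == "--pbf") {
            opts.pbf = true;
        } else if (a == "-g" || a == "--geojson") {
            opts.geojson = true;
        } else if (!a.empty() && a[0] == '-') {
            throw std::invalid_argument("unknown option: " + a);
        } else if (opts.file.empty()) {
            opts.file = a;
        } else {
            throw std::invalid_argument("unexpected argument: " + a);
        }
    }
    if (opts.file.empty()) throw std::invalid_argument("missing <file>");
    if (opts.zoom < 0) throw std::invalid_argument("missing --zoom");
    return opts;
}
```

Throwing on a missing required flag gives the unit tests something simple to assert against.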

build fails on Ubuntu

error is:

g++ osm-tiler.cpp -o osm-tiler -isystem./mason_packages/.link/include -L./mason_packages/.link/lib  -std=c++11 -fvisibility=hidden -g -Wall -Wextra -Wfloat-equal -Wundef -Wcast-align -Wwrite-strings -Wlong-long -Wmissing-declarations -Wredundant-decls -Wshadow -Woverloaded-virtual -O3 -DNDEBUG  -lz -lpthread -lboost_program_options -lboost_filesystem -lboost_system;
/tmp/ccNL9CFc.o: In function `boost::program_options::validation_error::validation_error(boost::program_options::validation_error::kind_t, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)':
/home/dev/projects/osm-tiler/./mason_packages/.link/include/boost/program_options/errors.hpp:373: undefined reference to `boost::program_options::validation_error::get_template[abi:cxx11](boost::program_options::validation_error::kind_t)'
/home/dev/projects/osm-tiler/./mason_packages/.link/include/boost/program_options/errors.hpp:373: undefined reference to `boost::program_options::error_with_option_name::error_with_option_name(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)'

I'm totally clueless with these dependencies. Am I missing a required package?

run on morec2

Let's do a planet run on a morec2 to see how it performs with a realistic dataset.

Use osmium::geom::Tile?

Libosmium has a class osmium::geom::Tile that already does a lot of what is implemented in osm-tiler/handler.hpp.
