taghistory's Introduction

OSM Tag Usage Analysis

Generates graphs of the usage of arbitrary OSM tags over time (with daily granularity) by number of OSM objects. Read more about OSM tags and this tool in my blog article.

Warning: The output is (currently) given only as counts of OSM objects! Like some of the statistics produced by taginfo, it is subject to the same limitations. Most notably, one cannot directly compare the number of tags used on different linear and polygonal features such as roads, land cover, etc., because such features are typically divided into many OSM objects of different sizes. For example, an existing road may be split into two pieces when a new turn restriction is added, which increases the count of each tag used on the road (even obsolete ones) by one in the OSM database. One therefore needs to pay close attention when comparing tags that are typically used on such features, even when comparing subtags that are typically used on the same kind of parent object (e.g. different values of the highway tag).
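To make the splitting effect concrete, here is a minimal sketch. The road, its tags and its lengths are made up for illustration, and `tag_counts` and `total_length` are hypothetical helpers, not part of this tool.

```python
from collections import Counter


def tag_counts(ways):
    """Count how many way objects carry each (key, value) tag."""
    counts = Counter()
    for way in ways:
        counts.update(way["tags"].items())
    return counts


def total_length(ways):
    """Sum the length of all ways in metres."""
    return sum(way["length_m"] for way in ways)


# Hypothetical road mapped as a single way.
road_tags = {"highway": "residential", "name": "Main Street"}
before = [{"length_m": 1000, "tags": dict(road_tags)}]

# The same road after it was split in two, e.g. to add a turn restriction.
after = [
    {"length_m": 400, "tags": dict(road_tags)},
    {"length_m": 600, "tags": dict(road_tags)},
]

print(tag_counts(before)[("highway", "residential")], total_length(before))
print(tag_counts(after)[("highway", "residential")], total_length(after))
```

The object count for every tag on the road doubles, while the mapped length stays the same.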

Technicalities

A simple osmium script (see /backend/) scans through a planet history dump and aggregates daily net differences of all used tags by OSM object type (node/way/relation). This takes care of created and deleted objects as well as tag modifications between different versions of an object.
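The idea of "daily net differences" can be sketched in a few lines of Python. This is not the osmium script from /backend/, only an illustration under simplifying assumptions: the input is the version history of one object, already reduced to hypothetical dicts with `date`, `visible` and `tags` fields, and the object type is left out (a full aggregation would add it to the key).

```python
from collections import Counter


def daily_tag_deltas(versions):
    """Return {(date, key, value): net change} for one object's history.

    `versions` must be in version order. A deleted version
    (visible=False) counts as having no tags at all.
    """
    deltas = Counter()
    previous = set()
    for version in versions:
        current = set(version["tags"].items()) if version["visible"] else set()
        for key, value in current - previous:   # tags that appeared
            deltas[(version["date"], key, value)] += 1
        for key, value in previous - current:   # tags that disappeared
            deltas[(version["date"], key, value)] -= 1
        previous = current
    # Changes that cancel out within one day leave no net difference.
    return {k: v for k, v in deltas.items() if v != 0}


# Hypothetical history: created, retagged, then deleted.
history = [
    {"date": "2012-01-01", "visible": True, "tags": {"amenity": "cafe"}},
    {"date": "2012-03-05", "visible": True,
     "tags": {"amenity": "restaurant", "name": "Luigi"}},
    {"date": "2013-07-20", "visible": False, "tags": {}},
]
deltas = daily_tag_deltas(history)
```

Summing these per-object results over all objects of a type gives the daily deltas for that type.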

This data is then stored into an sqlite database and is exposed by a simple REST API.
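As a rough illustration of the storage step, the sketch below keeps one row per object type, tag and day. The table name `tag_deltas`, its columns and the sample rows are assumptions made for this example; the schema actually used by the backend may differ.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical table with one delta per type, tag and day."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS tag_deltas (
            type  TEXT    NOT NULL,  -- node, way or relation
            key   TEXT    NOT NULL,
            value TEXT    NOT NULL,
            date  TEXT    NOT NULL,  -- YYYY-MM-DD
            delta INTEGER NOT NULL,
            PRIMARY KEY (type, key, value, date)
        )
        """
    )


def deltas_for(conn, osm_type, key, value):
    """Return [(date, delta), ...] for one tag, oldest day first."""
    cursor = conn.execute(
        "SELECT date, delta FROM tag_deltas "
        "WHERE type = ? AND key = ? AND value = ? ORDER BY date",
        (osm_type, key, value),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
create_schema(conn)
with conn:  # commits on success
    conn.executemany(
        "INSERT INTO tag_deltas VALUES (?, ?, ?, ?, ?)",
        [  # made-up sample rows
            ("node", "amenity", "drinking_water", "2010-08-28", 3),
            ("node", "amenity", "drinking_water", "2010-08-03", 1),
            ("way", "amenity", "drinking_water", "2010-08-28", 2),
        ],
    )
rows = deltas_for(conn, "node", "amenity", "drinking_water")
```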

API

Taghistory's own API is quite limited and currently not updated regularly. Please consider using an alternative like the Taginfo chronology API or the ohsome API instead.

Data returned from the taghistory API described below is available under the terms of the ODbL 1.0 license and copyright © OpenStreetMap contributors.

GET /<type>/<key>/<value>[?format=<format>]

Gets the osm object count history for the given tag (key, value) of the given OSM object type (type). type can be *** to search for any object type.

Returns a JSON array of objects containing date, delta and count fields indicating the net change (delta) of the respective tag usage on a particular date. For convenience, count contains the running cumulative sum of the deltas up to and including that date.

The optional GET-parameter format can be one of json (default value if omitted, see the example below) or csv for a downloadable csv file.

Example: http://taghistory.raifer.tech/relation/amenity/drinking_water

[
  {"date":"2010-08-03T00:00:00.000Z","delta":1,"count":1},
  {"date":"2010-08-28T00:00:00.000Z","delta":3,"count":4},
  …,
  {"date":"2014-10-28T00:00:00.000Z","delta":-3,"count":0}
]
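A client can work with such a response along these lines. The sketch uses only the first two entries of the example above and assumes that entries are sorted by date and that days without changes are omitted, as the example suggests. `counts_from_deltas` and `count_on` are hypothetical helper names.

```python
import json
from bisect import bisect_right
from datetime import date
from itertools import accumulate


def counts_from_deltas(history):
    """Rebuild the running count from the delta fields."""
    return list(accumulate(entry["delta"] for entry in history))


def count_on(history, day):
    """Return the count in effect on `day` (0 before the first entry)."""
    days = [date.fromisoformat(entry["date"][:10]) for entry in history]
    index = bisect_right(days, day)
    return history[index - 1]["count"] if index else 0


response_text = """[
  {"date":"2010-08-03T00:00:00.000Z","delta":1,"count":1},
  {"date":"2010-08-28T00:00:00.000Z","delta":3,"count":4}
]"""
history = json.loads(response_text)

print(counts_from_deltas(history))             # matches the count fields
print(count_on(history, date(2010, 8, 10)))    # between the two entries
```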

GET /<type>/<key>[?format=<format>]

Same as above, but matches any tag with the given key.
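Request URLs for both endpoints could be assembled as sketched below. `history_url` is a hypothetical helper; the base URL is taken from the example above, and percent-encoding of special characters in keys and values is an assumption about what the server accepts.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://taghistory.raifer.tech"  # host from the example above


def history_url(osm_type, key, value=None, fmt=None):
    """Build a request URL for /<type>/<key>[/<value>][?format=<format>]."""
    parts = [osm_type, key] if value is None else [osm_type, key, value]
    # Keep "*" literal so that the any-type wildcard "***" stays readable.
    path = "/".join(quote(part, safe="*") for part in parts)
    url = f"{BASE_URL}/{path}"
    if fmt is not None:
        url += "?" + urlencode({"format": fmt})
    return url


print(history_url("relation", "amenity", "drinking_water"))
print(history_url("***", "shop", fmt="csv"))
```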

todos

  • implement regular (e.g. daily) data updates
  • for linear and polygonal objects: use length or area as a metric instead of object count

see also

taghistory's People

Contributors

tilmanb, tyrasd

taghistory's Issues

backend code cleanup

the backend code here is still buggy, incomplete and missing documentation. todos:

  • add dev readme w/ install instructions
    • add instructions for how to import csv data into sqlite db (.mode csv, .import …)
  • fix missing escaping of " characters in tag strings of csv output
  • add server script that implements the taghistory api
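For the escaping item, the usual CSV convention is to wrap a field in double quotes and to double any `"` inside it. The sketch below shows that convention with Python's `csv` module; it is not the backend's output code, and the sample row and its column order are made up.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows to CSV text, quoting fields only where needed."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Hypothetical row: type, key, value, date, delta.
rows = [["node", "name", 'Café "Central"', "2015-01-01", 1]]
text = rows_to_csv(rows)
print(text)

# Reading the text back restores the original tag value.
parsed = list(csv.reader(io.StringIO(text)))
```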

related(?):

  • cut output from last processed day (which is typically incomplete)

Correlate sum with count of distinct users

This is a wishlist item - perhaps you can see value in it: (I cannot assess, if it is in scope of taghistory at all…)

When considering taghistory a popularity contest, that shows the count of votes (i.e. objects carrying a specific tag), where every voter has unlimited votes, it would be nice to also have a count of voters (the people that applied a tag to an object).

For rarely or moderately used tags, one could immediately see whether a tag was applied (voted on, in contest speak) by a single user, a few, or many. For tags with many occurrences, the distribution will be quite flat, but it might still be reasonable to have the number when comparing tags that are close in meaning, e.g.

Hope it is clear :)
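A minimal sketch of the requested metric, assuming a hypothetical input of (tag, object id, user) records, one for each time a user applied a tag to an object. None of these names exist in taghistory.

```python
from collections import defaultdict


def objects_and_users(taggings):
    """Return {tag: (object count, distinct user count)}."""
    objects = defaultdict(set)
    users = defaultdict(set)
    for tag, object_id, user in taggings:
        objects[tag].add(object_id)
        users[tag].add(user)
    return {tag: (len(objects[tag]), len(users[tag])) for tag in objects}


# Made-up records: one tag applied by a single user, one by three users.
taggings = [
    (("shop", "hobby"), 1, "alice"),
    (("shop", "hobby"), 2, "alice"),
    (("shop", "hobby"), 3, "alice"),
    (("shop", "fishmonger"), 4, "alice"),
    (("shop", "fishmonger"), 5, "bob"),
    (("shop", "fishmonger"), 6, "carol"),
]
stats = objects_and_users(taggings)
```

Both tags have three objects, but only the second one was applied by more than one user.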

Less popular tags do not have history after mid-2018?

It seems that tags with fewer than about 1000 uses contain history only up to about mid-2018, not after that.
After 2018, only a dashed line to the current value is shown.

Is that intentional? If so, it should be documented.

e.g. https://taghistory.raifer.tech/#***/shop/fishmonger&***/amenity/gambling&***/amenity/crematorium&***/amenity/exhibition_centre&***/amenity/emergency_service&***/amenity/vehicle_ramp&***/shop/hobby&***/shop/vacuum_cleaner produces:

[image: taghistory_min]

The only related thing I found is this diary comment, but it seems to be about taginfo history being limited to more popular values, not taghistory.

  • There is related strangeness in the taghistory database around that mid-2018 timeframe too, e.g. #33

More options such as specific region?

Could you add an option that lets people restrict their tags to a specific region?
For example, I would like to see nodes with the tag "shop=tobacco" only in Italy.

Do you think this is possible to implement?

I hope you can help me. Keep up the good work.

data updates

from https://github.com/tyrasd/taghistory#todos: implement regular (e.g. daily/weekly) data updates. Possible solutions:

  • reprocess the weekly full history dumps (maybe only all keys as a limited subset of tags to limit memory requirements?)
  • use Overpass augmented diffs to calculate daily diffs (currently not feasible because of performance issues, see drolbr/Overpass-API#322, possibly fixed by drolbr/Overpass-API#342)
  • use Overpass augmented diffs by aggregating standard minutely diffs (currently not feasible because of drolbr/Overpass-API#346)
  • download daily planet dumps and calculate tag usage deltas
  • maybe someone out there already runs a global, daily (or quicker) updated osm database which could generate daily tag count diffs as a side product of their update process. see also https://www.openstreetmap.org/user/tyr_asd/diary/39402#comment39983
  • compare daily results from taginfo
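The snapshot-based options above (daily planet dumps, or daily results from taginfo) boil down to subtracting two sets of tag counts. A minimal sketch, assuming each snapshot has already been reduced to a dict from (key, value) to object count; `snapshot_deltas` and the numbers are hypothetical.

```python
def snapshot_deltas(yesterday, today):
    """Return {tag: today - yesterday} for every tag whose count changed."""
    deltas = {}
    for tag in yesterday.keys() | today.keys():
        change = today.get(tag, 0) - yesterday.get(tag, 0)
        if change != 0:
            deltas[tag] = change
    return deltas


# Made-up counts from two consecutive days.
yesterday = {("shop", "hobby"): 120, ("shop", "tobacco"): 45, ("foo", "no"): 1}
today = {("shop", "hobby"): 123, ("shop", "tobacco"): 45, ("foo", "yes"): 2}
deltas = snapshot_deltas(yesterday, today)
```

Unlike the history-based approach, this only yields net daily changes per tag and cannot distinguish new objects from retagged ones.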

UI/UX Improvements

  • a way to remove unwanted tags/traces #14
  • make checkable items more obvious (reset zoom)
  • permalinks via url-hash
  • better UI for any-value selection

Make zoom UI more obvious

It is really hard to zoom in a predictable manner at the moment. This is especially true with a touch device. Adding some kind of touchable indicator would help a lot.

Allow export of data

Allowing the export of the raw data, e.g. as csv, would be awesome. Even better: an API for it.

Enable direct links to the graphs

Allow setting parameters via the URL, so that links can be shared. Alternatively, create a hash for a given configuration which could be shared.
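One way to encode a configuration in the URL hash is a list of type/key/value traces joined by `&`, the form that appears in the taghistory.raifer.tech link quoted in another issue on this page. The sketch below parses and rebuilds such a fragment. `parse_hash` and `build_hash` are hypothetical helpers, the key-only form is an assumption, and values containing `&` would need extra escaping.

```python
def parse_hash(fragment):
    """Split '#type/key/value&type/key' into (type, key, value) tuples."""
    traces = []
    for item in fragment.lstrip("#").split("&"):
        parts = item.split("/", 2)
        if len(parts) < 2:
            continue  # skip empty or malformed entries
        value = parts[2] if len(parts) == 3 else None
        traces.append((parts[0], parts[1], value))
    return traces


def build_hash(traces):
    """Inverse of parse_hash: join the traces back into a URL fragment."""
    items = []
    for osm_type, key, value in traces:
        parts = [osm_type, key] if value is None else [osm_type, key, value]
        items.append("/".join(parts))
    return "#" + "&".join(items)


traces = parse_hash("#***/shop/fishmonger&***/amenity/gambling")
print(traces)
print(build_hash(traces))
```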

taginfo integration

Let's integrate the backend&api into taginfo.

  • adapt backend preprocessing to fit into taginfo project structure
  • produce sqlite files directly (instead of via intermediate csv files)
  • (?) slimmer sqlite db by having a simpler db schema: (type, key, [value,] timestamps/days, deltas) with only 1 entry per tag
  • re-implement the taghistory api as taginfo api endpoints
  • make sure taginfo pages are accessible (e.g. found in search results) also for tags which currently have 0 occurrences in the db (e.g. foo=no)
  • add history tab with simple graph in taginfo
  • (?) implement history graph in tag comparison page in taginfo
  • switch this repo to use taginfo's api
  • document downloading/processing of .osh.pbf files in taginfo (for regional instances -> refer to geofabrik extracts)
