Comments (10)
Huh, this could be very interesting. Really what we need is something of a hybrid @miccolis @ericfischer monster that could build something like:
- A watchbot task that takes changesets, derives metrics from them, and pushes them into a Redshift cluster. A potential layout would be a table of timestamp-user-userstats.
Beyond this, I have some UI tweaks to make so that more advanced charts are possible, but for the most part this is a matter of getting the data in place.
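To make the timestamp-user-userstats layout a bit more concrete, here is a minimal sketch of the "derive metrics" step. It assumes the input is changeset XML with `created_at`, `user`, and `num_changes` attributes on each `<changeset>` element; the function name `user_stats_rows` and the choice of hourly buckets are hypothetical, and loading the rows into Redshift is left out.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def user_stats_rows(changesets_xml: str):
    """Aggregate changesets into (hour, user, changeset_count, change_count) rows.

    Assumes each <changeset> carries created_at, user and num_changes
    attributes; elements missing user or created_at are skipped.
    """
    totals = defaultdict(lambda: [0, 0])
    root = ET.fromstring(changesets_xml)
    for cs in root.iter("changeset"):
        user = cs.get("user")
        created = cs.get("created_at")
        if not user or not created:
            continue
        # Truncate an ISO 8601 timestamp such as 2015-03-01T10:05:00Z to the hour.
        hour = created[:13] + ":00:00Z"
        bucket = totals[(hour, user)]
        bucket[0] += 1
        bucket[1] += int(cs.get("num_changes", "0"))
    return sorted((h, u, n, c) for (h, u), (n, c) in totals.items())
```

Each returned tuple maps onto one row of the proposed table; a finer or coarser time bucket only changes how the timestamp is truncated.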
from osm-edit-report-deprecated.
Following. cc @rclark
I'd love to see an OSM => S3 changesets mirror (would love to use this for other stacks rather than hitting OSM directly), and whenever an object is pushed to this S3 bucket it could trigger S3 notifications or Lambda for Redshift processing.
- How many records are there each day in the changesets?
- How many changeset files are published each day?
- Is there any notification / feed / etc. for when a new changeset file is published currently?
- @lxbarth - would it be useful to be able to do one-off / adhoc queries of the last 24 to 48 hours of data, on a daily basis?
@ianshward - at a minimum we're talking about mirroring the replication feeds in these directories:
http://planet.osm.org/replication/
> How many changeset files are published each day?
- One per minute, hour, and day for data updates
- One per minute for changesets
> Is there any notification / feed / etc. for when a new changeset file is published currently?
There's no notification when replication files are updated; you'd have to poll for them.
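As a rough illustration of what polling could look like, the sketch below reads the minutely state file and works out which diff files are new. It assumes the state file lives at `minute/state.txt` under the replication directory, contains a `sequenceNumber=` line, and that diffs are stored under a zero-padded `AAA/BBB/CCC.osc.gz` path; the function names are hypothetical.

```python
import time
import urllib.request

# Assumed location of the minutely state file under the replication directory.
STATE_URL = "http://planet.osm.org/replication/minute/state.txt"


def parse_sequence(state_text: str) -> int:
    """Extract the sequence number from the text of a state file."""
    for line in state_text.splitlines():
        if line.startswith("sequenceNumber="):
            return int(line.split("=", 1)[1])
    raise ValueError("no sequenceNumber line found")


def diff_path(sequence: int) -> str:
    """Map a sequence number to its AAA/BBB/CCC.osc.gz relative path."""
    s = f"{sequence:09d}"
    return f"{s[0:3]}/{s[3:6]}/{s[6:9]}.osc.gz"


def poll(last_seen: int, interval: float = 60.0):
    """Yield relative paths of diffs newer than last_seen, forever."""
    while True:
        with urllib.request.urlopen(STATE_URL) as resp:
            latest = parse_sequence(resp.read().decode("utf-8"))
        for seq in range(last_seen + 1, latest + 1):
            yield diff_path(seq)
        last_seen = max(last_seen, latest)
        time.sleep(interval)
```

A mirror would download each yielded path and copy it to its own storage; error handling and retries are omitted here.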
> whenever an object is pushed to this S3 bucket it could trigger S3 notifications or Lambda for Redshift processing.
This is why the S3 notifications are interesting.
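For a sense of what a consumer of those notifications might do, here is a small sketch of a Lambda-style handler that pulls bucket and key out of an S3 event. The `Records[].s3.bucket.name` and `Records[].s3.object.key` fields follow the S3 event message layout; the helper name `new_objects` and the print-only handler are assumptions, and the Redshift processing itself is not shown.

```python
from urllib.parse import unquote_plus


def new_objects(event: dict):
    """Return (bucket, key) pairs for newly created objects in an S3 event."""
    found = []
    for record in event.get("Records", []):
        # Only react to object creation, not deletions or other event types.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event messages.
        found.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return found


def handler(event, context):
    """Lambda entry point: report each new file (real processing would go here)."""
    for bucket, key in new_objects(event):
        print(f"new replication file: s3://{bucket}/{key}")
```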
> @lxbarth - would it be useful to be able to do one-off / adhoc queries of the last 24 to 48 hours of data, on a daily basis?
No, at least not within the scope of a replication mirror.
@ianshward - could we run this as a public service for anyone looking for an on-AWS mirror? This is interesting because I'd like to be able to share any code we produce supporting the data team as openly as possible. If we build on proprietary services sharing our code becomes less useful.
I already have code in dynamosm that polls http://planet.osm.org/replication/ for changes. When it finds new minutely files there, it creates SNS notifications that a file is ready to be downloaded.
If this is broadly useful I can pull that code out of dynamosm into a standalone service. Systems could listen to this SNS topic in order to grab files from planet.osm.org when they're ready, or this service could fuel a full-blown mirror.
Our OSM mirror is alive and mirroring changesets onto S3. I got it running to see if it works, but we don't need to keep it running if no one is ready to consume it yet.
> If we build on proprietary services sharing our code becomes less useful.
The changesets on S3 are currently publicly accessible for download, but the notifications are simply pushed to SNS topics that require adequate permission to subscribe to.
@rclark nice!
@lxbarth @rclark @Rub21 if adhoc querying is not interesting, then I wonder whether it'd make sense to run a service which does something like:
- parses the .osc
- transforms it to JSON
- unloads it to S3
I'm not sure whether it's interesting or even practical to unload all of the data. But maybe the parts that we wrap in such a service are already open source, or could be written as such, and there could be a facility to unload only the data related to a specified list of users.
I don't know enough about the data in .osc to know if it's possible, but could you also create any interesting vector tiles out of it?
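The parse and transform steps, with the optional per-user filter, could look roughly like the sketch below. It assumes the .osc content is osmChange XML whose top-level children are `<create>`, `<modify>`, and `<delete>` blocks containing elements with a `user` attribute; the function name and the one-JSON-object-per-line output are hypothetical choices, and the unload to S3 is left out.

```python
import json
import xml.etree.ElementTree as ET


def osc_to_json_lines(osc_xml: str, users=None):
    """Convert osmChange XML into a list of JSON strings, one per element.

    If users is given, only elements edited by those users are kept.
    """
    root = ET.fromstring(osc_xml)
    lines = []
    for action in root:      # <create>, <modify>, <delete>
        for el in action:    # <node>, <way>, <relation>
            if users is not None and el.get("user") not in users:
                continue
            record = {
                "action": action.tag,
                "type": el.tag,
                **el.attrib,  # id, user, timestamp, lat/lon, ... kept as strings
                "tags": {t.get("k"): t.get("v") for t in el.findall("tag")},
            }
            lines.append(json.dumps(record, sort_keys=True))
    return lines
```

Writing the returned lines to a file and uploading it would complete the third step; attribute values are deliberately left as strings so that nothing is lost before deciding on a schema.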
Ahh, I left out that once you have days, months, or years of these JSON files in S3, you could then suck them into Redshift with something like metrics-warehouse and, if you wanted, do adhoc querying from StickShift that way. This assumes you're happy with whatever attributes you've included in the JSON files.
> could you also create any interesting vector tiles out of it?
@ianshward sounds like you're coming close to describing what dynamosm is intended to do -- transform this stream of .osc into a stream of GeoJSON, basically.
You may or may not have to go that far to extract metrics from .osc files; it all depends on exactly which metrics you want to collect.
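To illustrate the general idea of turning .osc into GeoJSON (this is not dynamosm's actual code), here is a minimal sketch that converts only the nodes in an osmChange document into Point features. The function name and the `osm:action` / `osm:id` property names are made up for the example; ways and relations would need their member nodes looked up elsewhere and are ignored.

```python
import xml.etree.ElementTree as ET


def nodes_to_geojson(osc_xml: str) -> dict:
    """Build a GeoJSON FeatureCollection of Points from the nodes in an .osc."""
    features = []
    root = ET.fromstring(osc_xml)
    for action in root:  # <create>, <modify>, <delete>
        for node in action.findall("node"):
            lat, lon = node.get("lat"), node.get("lon")
            if lat is None or lon is None:
                continue  # e.g. a deleted node written without coordinates
            props = {t.get("k"): t.get("v") for t in node.findall("tag")}
            props["osm:action"] = action.tag
            props["osm:id"] = node.get("id")
            features.append({
                "type": "Feature",
                # GeoJSON orders coordinates as [longitude, latitude].
                "geometry": {"type": "Point", "coordinates": [float(lon), float(lat)]},
                "properties": props,
            })
    return {"type": "FeatureCollection", "features": features}
```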
Stale.